17867, a novel human aminopeptidase

ABSTRACT

The present invention relates to a newly identified human aminopeptidase. The invention also relates to polynucleotides encoding the aminopeptidase. The invention further relates to methods using the aminopeptidase polypeptides and polynucleotides as a target for diagnosis and treatment in aminopeptidase-related disorders. The invention further relates to drug-screening methods using the aminopeptidase polypeptides and polynucleotides to identify agonists and antagonists for diagnosis and treatment. The invention further encompasses agonists and antagonists based on the aminopeptidase polypeptides and polynucleotides. The invention further relates to procedures for producing the aminopeptidase polypeptides and polynucleotides.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. Ser. No. 10/039,073, filed Dec. 31, 2001 (pending), which is a continuation of U.S. Ser. No. 09/345,650, filed Jun. 30, 1999, now U.S. Pat. No. 6,362,324, which are hereby incorporated in their entirety by reference herein.

FIELD OF THE INVENTION

The present invention relates to a newly identified human aminopeptidase. The invention also relates to polynucleotides encoding the aminopeptidase. The invention further relates to methods using the aminopeptidase polypeptides and polynucleotides as a target for diagnosis and treatment in aminopeptidase-related disorders. The invention further relates to drug-screening methods using the aminopeptidase polypeptides and polynucleotides to identify agonists and antagonists for diagnosis and treatment. The invention further encompasses agonists and antagonists based on the aminopeptidase polypeptides and polynucleotides. The invention further relates to procedures for producing the aminopeptidase polypeptides and polynucleotides.

BACKGROUND OF THE INVENTION

Proteases may function in carcinogenesis by inactivating or activating regulators of the cell cycle, differentiation, programmed cell death, or other processes affecting cancer development and/or progression. Consistent with the model involving protease activity and tumor progression, certain protease inhibitors have been shown to be effective inhibitors of carcinogenesis both in vitro and in vivo.

Aminopeptidases (APs) are a group of widely distributed exopeptidases that catalyse the hydrolysis of amino acid residues from the amino-terminus of polypeptides and proteins. The enzymes are found in plant and animal tissue, in eukaryotes and prokaryotes, and in secreted and soluble forms. Biological functions of aminopeptidases include protein maturation, terminal degradation of proteins, hormone level regulation, and cell-cycle control.

The enzymes are implicated in a host of conditions and disorders including aging, cancers, cataracts, cystic fibrosis and leukemias. In eukaryotes, APs are associated with removal of the initiator methionine. In prokaryotes the methionine is removed by methionine aminopeptidase subsequent to removal of the N-formyl group from the initiator N-formyl methionine, facilitating subsequent modifications such as N-acetylation and N-myristoylation. In E. coli AP-A (pepA), the xerB gene product is required for stabilization of unstable plasmid multimers.

APs are also involved in the metabolism of secreted regulatory molecules, such as hormones and neurotransmitters, and modulation of cell-cell interactions. In mammalian cells and tissues, the enzymes are apparently required for terminal stages of protein degradation, and EGF-induced cell-cycle control; and may have a role in protein turnover and selective elimination of obsolete or defective proteins. Furthermore, the enzymes are implicated in the supply of amino acids and energy during starvation and/or differentiation, and degradation of transported exogenous peptides to amino acids for nutrition. As leukotriene A4 hydrolase may be an aminopeptidase, APs may further have a role in inflammation. Industrial uses of the enzymes include modification of amino termini in recombinantly expressed proteins. See A. Taylor (1993) TIBS 18:167-172.

A variety of aminopeptidases have been identified from a wide variety of tissues and organisms, including zinc aminopeptidase and aminopeptidase M from rat kidney membrane; arginine aminopeptidase from liver; aminopeptidase N^(b) from muscle; leucine aminopeptidase (LAP) from bovine and hog lens and kidney; aminopeptidase A (xerB gene product) from E. coli; yscI APE1/LAP4 and aminopeptidase A (pep4 gene product) from S. cerevisiae; LAP from Aeromonas; dipeptidase from mouse ascites; methionine aminopeptidase from Salmonella, E. coli, S. cerevisiae and hog liver; and D-amino acid aminopeptidase from Ochrobactrum anthropi SCRC C1-38.

Of these aminopeptidases, the structure of bovine lens leucine aminopeptidase (blLAP) is well characterized and consists of a homohexamer synthesized as a large precursor, each monomer containing two thirds of the protein in a major lobe and one third in a minor lobe. The minor lobe contains the N-terminal 150 residues. All putative active site residues, presumably also the inhibitor bestatin-binding site, are found in the C-terminal lobe and include Ala-333, Asn-330, Leu360, Asp332, Asp255, Glu-334, Lys250, Asp273, Met454, Ala-451, Gly362, Thr-359, Met270, Lys262, Gly362 and Ile-421.

Many aminopeptidases are metalloenzymes, requiring divalent cations, with specificities for Zn²⁺ or Co²⁺; however, particular sites of certain aminopeptidases can readily utilize Mn²⁺ and Mg²⁺. Residues used to ligand Zn²⁺ include the His His Glu and Asp Glu Lys configurations. In addition to bestatin, boronic and phosphonic acids, α-methylleucine and isoamylthioamide are identified as competitive inhibitors for most aminopeptidases. See A. Taylor (1993) TIBS 18:167-172; Burley et al. (1992) J. Mol. Biol. 224:113-140; Taylor et al. (1993) Biochemistry 32:784-790.

Aminopeptidases from various organisms and various tissues within an organism have high degrees of primary sequence homology, as indicated by immunological assays. Some enzymes also exhibit very similar kinetic profiles. Direct amino acid sequence comparison of blLAP and aminopeptidase-A from E. coli shows 18, 44 and 35% identity for the amino- and carboxy-terminals, and the entire protein, respectively. The comparison shows 46, 66, and 60% identity for the respective regions. See Burley et al. (1992) J. Mol. Biol. 224:113-140.

Bovine lens leucine aminopeptidase (blLAP), bovine kidney LAP, human lens and liver LAPs, hog lens, kidney and intestine LAPs, proline-AP, E. coli AP-A, AP-I and the S. typhimurium pepA gene product have been categorized as belonging to the family of zinc aminopeptidases which utilize the residues Asp Glu Lys for zinc binding and the active site amino acid configuration described above for bovine LAP for substrate binding. This family, possibly also including Aeromonas LAP, is suggested to be distinguished from zinc proteases which utilize His His Glu in zinc binding and Arg in substrate binding. The Saccharomyces methionine-AP is characterized to contain two zinc finger like motifs in the amino-terminus and shares little homology with blLAP. See A. Taylor (1993) TIBS 18:167-171; Watt et al. (1989) J. Biol. Chem. 264:5480-5487.

Leucine aminopeptidase expression is regulated at the transcriptional level, evidenced by enhancement of both activity and mRNA upon removal of serum in in vitro aged and/or transforming lens epithelial cells. Furthermore, LAP mRNA and protein are induced by interferon γ in human ACHN renal carcinoma, A549 lung carcinoma, HS153 fibroblasts and A375 melanoma. Regulation by development and growth is also implicated. The E. coli pepN gene is transcriptionally regulated upon anaerobiosis and phosphate starvation. Membrane bound AP-N (CD13) is expressed in a lineage-restricted manner by subsets of normal and malignant cells, apparently through regulation by physically distinct promoters. Expression of the yeast yscI product APE1 is dependent upon the levels of yscA and PEP4 gene products. Synthesis of APE1 is sensitive to media glucose levels, and the activity of yeast aminopeptidase is sensitive to substitution of ammonia rather than peptone as the source of nitrogen. See Harris et al. (1992) J. Biol. Chem. 267:6865-6869; Jones et al. (1982) Genetics 102:665-677.

Accordingly, aminopeptidases are a major target for drug action and development. Therefore, it is valuable to the field of pharmaceutical development to identify and characterize previously unknown aminopeptidases. The present invention advances the state of the art by providing a previously unidentified human aminopeptidase.

SUMMARY OF THE INVENTION

It is an object of the invention to identify novel aminopeptidases.

It is a further object of the invention to provide novel aminopeptidase polypeptides that are useful as reagents or targets in aminopeptidase assays applicable to treatment and diagnosis of aminopeptidase-related disorders.

It is a further object of the invention to provide polynucleotides corresponding to the novel aminopeptidase polypeptides that are useful as targets and reagents in aminopeptidase assays applicable to treatment and diagnosis of aminopeptidase-related disorders and useful for producing novel aminopeptidase polypeptides by recombinant methods.

A specific object of the invention is to identify compounds that act as agonists and antagonists and modulate the expression of the novel aminopeptidase.

A further specific object of the invention is to provide compounds that modulate expression of the aminopeptidase for treatment and diagnosis of aminopeptidase-related disorders.

The invention is thus based on the identification of a novel human aminopeptidase. The amino acid sequence is shown in SEQ ID NO 1. The nucleotide sequence is shown as SEQ ID NO 2.

The invention provides isolated aminopeptidase polypeptides, including a polypeptide having the amino acid sequence shown in SEQ ID NO 1 or the amino acid sequence encoded by the cDNA deposited as ATCC Patent Deposit No. PTA-1642 on Apr. 5, 2000 (“the deposited cDNA”).

The invention also provides isolated aminopeptidase nucleic acid molecules having the sequence shown in SEQ ID NO 2 or in the deposited cDNA.

The invention also provides variant polypeptides having an amino acid sequence that is substantially homologous to the amino acid sequence shown in SEQ ID NO 1 or encoded by the deposited cDNA.

The invention also provides variant nucleic acid sequences that are substantially homologous to the nucleotide sequence shown in SEQ ID NO 2 or in the deposited cDNA.

The invention also provides fragments of the polypeptide shown in SEQ ID NO 1 and nucleotide sequence shown in SEQ ID NO 2, as well as substantially homologous fragments of the polypeptide or nucleic acid.

The invention further provides nucleic acid constructs comprising the nucleic acid molecules described herein. In a preferred embodiment, the nucleic acid molecules of the invention are operatively linked to a regulatory sequence.

The invention also provides vectors and host cells for expressing the aminopeptidase nucleic acid molecules and polypeptides, and particularly recombinant vectors and host cells.

The invention also provides methods of making the vectors and host cells and methods for using them to produce the aminopeptidase nucleic acid molecules and polypeptides.

The invention also provides antibodies or antigen-binding fragments thereof that selectively bind the aminopeptidase polypeptides and fragments.

The invention also provides methods of screening for compounds that modulate expression or activity of the aminopeptidase polypeptides or nucleic acid (RNA or DNA).

The invention also provides a process for modulating aminopeptidase polypeptide or nucleic acid expression or activity, especially using the screened compounds. Modulation may be used to treat conditions related to aberrant activity or expression of the aminopeptidase polypeptides or nucleic acids.

The invention also provides assays for determining the activity of or the presence or absence of the aminopeptidase polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

The invention also provides assays for determining the presence of a mutation in the polypeptides or nucleic acid molecules, including for disease diagnosis.

In still a further embodiment, the invention provides a computer readable means containing the nucleotide and/or amino acid sequences of the nucleic acids and polypeptides of the invention, respectively.

DESCRIPTION OF THE DRAWINGS

FIG. 1A-C shows the aminopeptidase nucleotide sequence (SEQ ID NO 2), the coding region (nucleotides 146-3028 of SEQ ID NO:2; nucleotides 1-2883 of SEQ ID NO:3) and the deduced amino acid sequence (SEQ ID NO 1).

FIG. 2 shows an analysis of the aminopeptidase amino acid sequence: αβ turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 3 shows a hydrophobicity plot of the aminopeptidase.

FIG. 4A-C shows an analysis of the aminopeptidase open reading frame for amino acids corresponding to specific functional sites. The protein also contains a zinc binding region signature found in neutral zinc metallopeptidases.

FIG. 5 shows RNA expression of the aminopeptidase in normal human tissues and in carcinomas.

FIG. 6 shows RNA expression of the aminopeptidase in human tissues and cells.

DETAILED DESCRIPTION OF THE INVENTION

Polypeptides

The invention is based on the discovery of a novel human aminopeptidase. Specifically, an expressed sequence tag (EST) was selected based on homology to aminopeptidase sequences. This EST was used to design primers based on sequences that it contains, and the primers were used to identify a cDNA from an endothelial cell cDNA library. Positive clones were sequenced and the overlapping fragments were assembled. Analysis of the assembled sequence revealed that the cloned cDNA molecule encodes an aminopeptidase.

The invention thus relates to a novel aminopeptidase having the deduced amino acid sequence shown in FIG. 1 (SEQ ID NO 1) or having the amino acid sequence encoded by the cDNA deposited in a bacterial host with the Patent Depository of the American Type Culture Collection (ATCC), Manassas, Va., on Apr. 5, 2000 and assigned Patent Deposit No. PTA-1642.

The deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms. The deposit is provided as a convenience to those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. § 112. The deposited sequence, as well as the polypeptides encoded by the sequence, is incorporated herein by reference and controls in the event of any conflict, such as a sequencing error, with description in this application.

“Aminopeptidase polypeptide” or “aminopeptidase protein” refers to the polypeptide in SEQ ID NO 1 or encoded by the deposited cDNA. The term “aminopeptidase protein” or “aminopeptidase polypeptide”, however, further includes the numerous variants described herein, as well as fragments derived from the full-length aminopeptidase and variants.

Tissues and/or cells in which the aminopeptidase is found include, but are not limited to, the tissues shown in FIGS. 5 and 6. In addition to these tissues, expression has also been found in colon and breast carcinoma and in lung carcinoma, especially squamous cell carcinoma.

The present invention thus provides an isolated or purified aminopeptidase polypeptide and variants and fragments thereof.

Based on a BLAST search, highest homology was shown to an insulin-regulated membrane aminopeptidase.

As used herein, a polypeptide is said to be “isolated” or “purified” when it is substantially free of cellular material when it is isolated from recombinant and non-recombinant cells, or free of chemical precursors or other chemicals when it is chemically synthesized. A polypeptide, however, can be joined to another polypeptide with which it is not normally associated in a cell and still be considered “isolated” or “purified.”

The aminopeptidase polypeptides can be purified to homogeneity. It is understood, however, that preparations in which the polypeptide is not purified to homogeneity are useful and considered to contain an isolated form of the polypeptide. The critical feature is that the preparation allows for the desired function of the polypeptide, even in the presence of considerable amounts of other components. Thus, the invention encompasses various degrees of purity.

In one embodiment, the language “substantially free of cellular material” includes preparations of the aminopeptidase having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins. When the polypeptide is recombinantly produced, it can also be substantially free of culture medium, i.e., culture medium represents less than about 20%, less than about 10%, or less than about 5% of the volume of the protein preparation.

An aminopeptidase polypeptide is also considered to be isolated when it is part of a membrane preparation or is purified and then reconstituted with membrane vesicles or liposomes.

The language “substantially free of chemical precursors or other chemicals” includes preparations of the aminopeptidase polypeptide in which it is separated from chemical precursors or other chemicals that are involved in its synthesis. In one embodiment, the language “substantially free of chemical precursors or other chemicals” includes preparations of the polypeptide having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about 20% chemical precursors or other chemicals, less than about 10% chemical precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.

In one embodiment, the aminopeptidase polypeptide comprises the amino acid sequence shown in SEQ ID NO 1. However, the invention also encompasses sequence variants. Variants include a substantially homologous protein encoded by the same genetic locus in an organism, i.e., an allelic variant. Variants also encompass proteins derived from other genetic loci in an organism, but having substantial homology to the aminopeptidase of SEQ ID NO 1. Variants also include proteins substantially homologous to the aminopeptidase but derived from another organism, i.e., an ortholog. Variants also include proteins that are substantially homologous to the aminopeptidase that are produced by chemical synthesis. Variants also include proteins that are substantially homologous to the aminopeptidase that are produced by recombinant methods. It is understood, however, that variants exclude any amino acid sequences disclosed prior to the invention.

As used herein, two proteins (or a region of the proteins) are substantially homologous when the amino acid sequences are at least about 60-65%, 65-70%, 70-75%, typically at least about 80-85%, and most typically at least about 90-95% or more homologous. A substantially homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid sequence hybridizing to the nucleic acid sequence, or portion thereof, of the sequence shown in SEQ ID NO 2 under stringent conditions as more fully described below.

To determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, or 90% of the length of the reference sequence (e.g., when aligning a second sequence to the amino acid sequences herein having 502 amino acid residues, at least 165, preferably at least 200, more preferably at least 250, even more preferably at least 300, and even more preferably at least 350, 400, 450, and 500 amino acid residues are aligned). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
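The position-by-position comparison described above can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned and padded with a gap character, and it adopts one common convention (identical positions divided by the number of alignment columns, so that every gap position lowers the result). The function name and the choice of denominator are assumptions made for illustration and are not taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical positions / alignment columns * 100,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # column carries no information
        columns += 1
        if res_a == res_b:
            identical += 1  # same residue or nucleotide at this position
    if columns == 0:
        return 0.0
    return 100.0 * identical / columns
```

Under this convention a one-residue gap in a six-column alignment yields five identical positions out of six; other conventions (for example, dividing by the shorter sequence length) would give a different figure for the same alignment.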

The invention also encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by the aminopeptidase. Similarity is determined by conserved amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Conservative substitutions are likely to be phenotypically silent. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).

TABLE 1
Conservative Amino Acid Substitutions

Aromatic      Phenylalanine, Tryptophan, Tyrosine
Hydrophobic   Leucine, Isoleucine, Valine
Polar         Glutamine, Asparagine
Basic         Arginine, Lysine, Histidine
Acidic        Aspartic Acid, Glutamic Acid
Small         Alanine, Serine, Threonine, Methionine, Glycine
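As an illustration only, the groupings of Table 1 can be encoded as sets of one-letter residue codes and used to test whether a given substitution stays within a group. The dictionary and function names below are hypothetical, and treating "same Table 1 group" as the test for a conservative substitution is a simplifying assumption of this sketch.

```python
# Table 1 groupings expressed in one-letter amino acid codes.
TABLE_1_GROUPS = {
    "aromatic": {"F", "W", "Y"},
    "hydrophobic": {"L", "I", "V"},
    "polar": {"Q", "N"},
    "basic": {"R", "K", "H"},
    "acidic": {"D", "E"},
    "small": {"A", "S", "T", "M", "G"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues fall in the same Table 1 group.

    An unchanged residue is not a substitution and returns False.
    """
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False
    return any(
        original in group and replacement in group
        for group in TABLE_1_GROUPS.values()
    )
```

For example, an Asp to Glu change is reported as conservative, whereas a Lys to Asp change is not.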

The comparison of sequences and determination of percent identity and similarity between two sequences can be accomplished using a mathematical algorithm. (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991).

A preferred, non-limiting example of such a mathematical algorithm is described in Karlin et al. (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. See www.ncbi.nlm.nih.gov. In one embodiment, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20).

In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman et al. (1970) (J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (Devereux et al. (1984) Nucleic Acids Res. 12(1):387) (available at www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6.
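For orientation, the dynamic-programming idea behind the Needleman et al. algorithm can be sketched in a few lines of Python. This sketch is not the GAP program: it assumes a simple match/mismatch score in place of a BLOSUM 62 or PAM250 matrix, and a single linear gap penalty in place of separate gap and length weights. All names and default values are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Global alignment with a linear gap penalty (illustrative only).

    Returns (score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing but gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: best of diagonal (pair), up (gap in b), left (gap in a).
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The two aligned strings returned here have equal length and can be compared column by column to obtain a percent identity. A production comparison would substitute a residue scoring matrix and affine gap costs, as the GAP parameters above imply.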

Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torelli et al. (1994) Comput. Appl. Biosci. 10:3-5; and FASTA described in Pearson et al. (1988) PNAS 85:2444-8.

A variant polypeptide can differ in amino acid sequence by one or more substitutions, deletions, insertions, inversions, fusions, and truncations or a combination of any of these.

Variant polypeptides can be fully functional or can lack function in one or more activities. Thus, in the present case, variations can affect the function, for example, of one or more of the regions corresponding to the catalytic region, regulatory regions, substrate binding regions, zinc binding regions, regions involved in membrane association, and regions involved in enzyme modification, for example, by phosphorylation.

Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non-critical regions. Functional variants can also contain substitution of similar amino acids, which results in no change or an insignificant change in function. Alternatively, such substitutions may positively or negatively affect function to some degree.

Non-functional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncations, or a substitution, insertion, inversion, or deletion in a critical residue or critical region.

As indicated, variants can be naturally-occurring or can be made by recombinant means or chemical synthesis to provide useful and novel characteristics for the aminopeptidase polypeptide. This includes preventing immunogenicity from pharmaceutical formulations by preventing protein aggregation.

Useful variations further include alteration of catalytic activity. For example, one embodiment involves a variation at the peptide binding site that results in binding but not hydrolysis of the peptide substrate. A further useful variation at the same site can result in altered affinity for the peptide substrate. Useful variations also include changes that provide for affinity for another peptide substrate. Another useful variation provides a fusion protein in which one or more domains or subregions are operationally fused to one or more domains or subregions from another aminopeptidase.

Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al. (1989) Science 244:1081-1085). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity, such as peptide bond hydrolysis in vitro or related biological activity, such as proliferative activity. Sites that are critical for binding can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al. (1992) J. Mol. Biol. 224:899-904; de Vos et al. (1992) Science 255:306-312).
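The in silico counterpart of alanine scanning, enumerating every single-alanine mutant of a sequence, can be sketched as follows. The function name and the conventional label format (original residue, 1-based position, "A") are choices made for this sketch; positions that are already alanine are skipped because substituting them would reproduce the parent sequence.

```python
def alanine_scan(sequence: str):
    """List (label, mutant_sequence) pairs for every single Ala substitution.

    Labels use the conventional form <original><1-based position>A, e.g. "E4A".
    """
    mutants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue  # already alanine; no distinct mutant to make
        mutated = sequence[:index] + "A" + sequence[index + 1:]
        mutants.append((f"{residue}{index + 1}A", mutated))
    return mutants
```

Applied to the five-residue motif GAMEN, the sketch yields four mutants (G1A, M3A, E4A and N5A), each of which would then be assayed for activity as described above.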

Substantial homology can be to the entire nucleic acid or amino acid sequence or to fragments of these sequences.

The invention thus also includes polypeptide fragments of the aminopeptidase. Fragments can be derived from the amino acid sequence shown in SEQ ID NO. 1. However, the invention also encompasses fragments of the variants of the aminopeptidase as described herein.

The fragments to which the invention pertains, however, are not to be construed as encompassing fragments that may be disclosed prior to the present invention.

Accordingly, a fragment can comprise at least about 10, 15, 20, 25, 30, 35, 40, 45, 50 or more contiguous amino acids. Fragments can retain one or more of the biological activities of the protein, for example the ability to bind to or hydrolyze target peptides. Fragments can also be used as an immunogen to generate aminopeptidase antibodies.

Biologically active fragments (peptides which are, for example, 5, 7, 10, 12, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or more amino acids in length) can comprise a functional site. Such sites include but are not limited to the catalytic site, regulatory sites, sites important for substrate recognition or binding, zinc binding region, the region containing a metalloprotease motif (IAHELAHQW), sites containing the motif characteristic of aminopeptidases in the M1 family (GAMEN), the site contributing to exopeptidase specificity, the peptidase domain from about amino acid 69 to about amino acid 458, phosphorylation sites, glycosylation sites, and other functional sites disclosed herein.

Such sites or motifs can be identified by means of routine computerized homology searching procedures.
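As a minimal illustration of a computerized motif search, the sketch below scans a protein sequence for exact occurrences of the two motifs named above (IAHELAHQW and GAMEN) and reports 1-based start positions. Exact string matching is a deliberate simplification of real homology searching, and the dictionary keys and function name are hypothetical.

```python
# Motif strings are taken from the text; the key names are illustrative.
MOTIFS = {
    "metalloprotease": "IAHELAHQW",
    "m1_family": "GAMEN",
}


def find_motifs(sequence: str, motifs=MOTIFS):
    """Return {motif_name: [1-based start positions]} for exact matches."""
    hits = {}
    for name, motif in motifs.items():
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start + 1)  # convert to 1-based numbering
            start = sequence.find(motif, start + 1)
        hits[name] = positions
    return hits
```

On a made-up test peptide such as MKTGAMENLLIAHELAHQWFG (not a sequence from this application), the sketch reports GAMEN starting at position 4 and IAHELAHQW starting at position 11.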

Fragments, for example, can extend in one or both directions from the functional site to encompass 5, 10, 15, 20, 30, 40, 50, or up to 100 amino acids. Further, fragments can include sub-fragments of the specific sites or regions disclosed herein, which sub-fragments retain the function of the site or region from which they are derived.
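Extending a fragment outward from a functional site can be expressed as simple coordinate arithmetic. The sketch below assumes 1-based inclusive coordinates, interprets the flank as a number of residues added on each chosen side, and clips the result at the sequence ends; these are assumptions of the sketch, and the function and parameter names are hypothetical.

```python
def flanked_fragment(sequence: str, site_start: int, site_end: int,
                     flank: int, upstream: bool = True,
                     downstream: bool = True):
    """Return (begin, end, fragment) for a site extended by `flank` residues.

    Coordinates are 1-based and inclusive; extension is clipped to the
    ends of the sequence.
    """
    if not (1 <= site_start <= site_end <= len(sequence)):
        raise ValueError("site coordinates fall outside the sequence")
    begin = max(1, site_start - flank) if upstream else site_start
    end = min(len(sequence), site_end + flank) if downstream else site_end
    return begin, end, sequence[begin - 1:end]
```

For instance, extending a site at positions 4 to 8 of a 21-residue test peptide by 5 residues in both directions gives positions 1 to 13, because the upstream extension is clipped at the N-terminus.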

These regions can be identified by well-known methods involving computerized homology analysis.

The invention also provides fragments with immunogenic properties. These contain an epitope-bearing portion of the aminopeptidase and variants. These epitope-bearing peptides are useful to raise antibodies that bind specifically to an aminopeptidase polypeptide or region or fragment. These peptides can contain at least 10, 12, at least 14, or between at least about 15 to about 30 amino acids.

Non-limiting examples of antigenic polypeptides that can be used to generate antibodies include peptides derived from extracellular regions. Regions having a high antigenicity index are shown in FIG. 2. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular peptide regions.

The epitope-bearing aminopeptidase polypeptides may be produced by any conventional means (Houghten, R. A. (1985) Proc. Natl. Acad. Sci. USA 82:5131-5135). Simultaneous multiple peptide synthesis is described in U.S. Pat. No. 4,631,211.

Fragments can be discrete (not fused to other amino acids or polypeptides) or can be within a larger polypeptide. Further, several fragments can be comprised within a single larger polypeptide. In one embodiment a fragment designed for expression in a host can have heterologous pre- and pro-polypeptide regions fused to the amino terminus of the aminopeptidase fragment and an additional region fused to the carboxyl terminus of the fragment.

The invention thus provides chimeric or fusion proteins. These comprise an aminopeptidase peptide sequence operatively linked to a heterologous peptide having an amino acid sequence not substantially homologous to the aminopeptidase. “Operatively linked” indicates that the aminopeptidase peptide and the heterologous peptide are fused in-frame. The heterologous peptide can be fused to the N-terminus or C-terminus of the aminopeptidase or can be internally located.

In one embodiment the fusion protein does not affect aminopeptidase function per se. For example, the fusion protein can be a GST-fusion protein in which the aminopeptidase sequences are fused to the C-terminus of the GST sequences. Other types of fusion proteins include, but are not limited to, enzymatic fusion proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL4 fusions, poly-His fusions and Ig fusions. Such fusion proteins, particularly poly-His fusions, can facilitate the purification of recombinant aminopeptidase. In certain host cells (e.g., mammalian host cells), expression and/or secretion of a protein can be increased by using a heterologous signal sequence. Therefore, in another embodiment, the fusion protein contains a heterologous signal sequence at its N-terminus.

EP-A-0 464 533 discloses fusion proteins comprising various portions of immunoglobulin constant regions. The Fc is useful in therapy and diagnosis and thus results, for example, in improved pharmacokinetic properties (EP-A 0 232 262). In drug discovery, for example, human proteins have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists (Bennett et al. (1995) J. Mol. Recog. 8:52-58 and Johanson et al. (1995) J. Biol. Chem. 270:9459-9471). Thus, this invention also encompasses soluble fusion proteins containing an aminopeptidase polypeptide and various portions of the constant regions of heavy or light chains of immunoglobulins of various subclasses (IgG, IgM, IgA, IgE). The preferred immunoglobulin is the constant part of the heavy chain of human IgG, particularly IgG1, where fusion takes place at the hinge region. For some uses it is desirable to remove the Fc after the fusion protein has been used for its intended purpose, for example when the fusion protein is to be used as antigen for immunizations. In a particular embodiment, the Fc part can be removed in a simple way by a cleavage sequence, which is also incorporated and can be cleaved with factor Xa.

A chimeric or fusion protein can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different protein sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al. (1992) Current Protocols in Molecular Biology). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). An aminopeptidase-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the aminopeptidase.
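The in-frame requirement described above can be illustrated with a minimal sketch. It assumes that both partners are supplied as plain DNA coding sequences read from their first codon; the function name and the toy sequences are hypothetical and are not taken from GST or from the aminopeptidase.

```python
# Standard DNA stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame_fusion(n_terminal_cds: str, c_terminal_cds: str) -> bool:
    """Return True if the two coding sequences join into one open reading frame.

    Illustrative check only: each partner must consist of whole codons, and
    no stop codon may occur before the final codon of the fused sequence.
    """
    upstream = n_terminal_cds.upper()
    downstream = c_terminal_cds.upper()
    # A partner with a partial codon would shift the reading frame.
    if len(upstream) % 3 != 0 or len(downstream) % 3 != 0:
        return False
    fused = upstream + downstream
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # A stop codon before the last codon would truncate the fusion protein.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


# Hypothetical toy sequences used only for illustration.
tag_cds = "ATGGCTCAT"        # Met-Ala-His, no stop codon
insert_cds = "GGTGAACTGTAA"  # Gly-Glu-Leu-stop

print(is_in_frame_fusion(tag_cds, insert_cds))     # fusion moiety in frame
print(is_in_frame_fusion("ATGGCTCA", insert_cds))  # one base missing upstream
```

A check of this kind would typically be applied to the planned junction before cloning; it does not model linkers, introns, or vector-specific sequences.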

Another form of fusion protein is one that directly affects aminopeptidase functions. Accordingly, an aminopeptidase polypeptide is encompassed by the present invention in which one or more of the aminopeptidase regions (or parts thereof) has been replaced by homologous regions (or parts thereof) from another aminopeptidase. Accordingly, various permutations are possible. Thus, chimeric aminopeptidases can be formed in which one or more of the native domains or subregions has been replaced by another.

Additionally, chimeric aminopeptidase proteins can be produced in which one or more functional sites is derived from a different aminopeptidase. It is understood however that sites could be derived from aminopeptidases that occur in the mammalian genome but which have not yet been discovered or characterized.

The isolated aminopeptidase protein can be purified from cells that naturally express it, such as from any of those tissues shown in FIGS. 5 and 6, especially purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods.

In one embodiment, the protein is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the aminopeptidase polypeptide is cloned into an expression vector, the expression vector introduced into a host cell and the protein expressed in the host cell. The protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. Polypeptides often contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally-occurring amino acids. Further, many amino acids, including the terminal amino acids, may be modified by natural processes, such as processing and other post-translational modifications, or by chemical modification techniques well known in the art. Common modifications that occur naturally in polypeptides are described in basic texts, detailed monographs, and the research literature, and they are well known to those of skill in the art.

Accordingly, the polypeptides also encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by the genetic code, in which a substituent group is included, in which the mature polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or in which the additional amino acids are fused to the mature polypeptide, such as a leader or secretory sequence or a sequence for purification of the mature polypeptide or a pro-protein sequence.

Known modifications include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

Such modifications are well known to those of skill in the art and have been described in great detail in the scientific literature. Several particularly common modifications, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic texts, such as Proteins-Structure and Molecular Properties, 2nd ed., T. E. Creighton, W. H. Freeman and Company, New York (1993). Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (1990) Meth. Enzymol. 182:626-646; and Rattan et al. (1992) Ann. N.Y. Acad. Sci. 663:48-62.

As is also well known, polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of post-translation events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translational natural processes and by synthetic methods.

Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. Blockage of the amino or carboxyl group in a polypeptide, or both, by a covalent modification, is common in naturally-occurring and synthetic polypeptides. For instance, the amino-terminal residue of polypeptides made in E. coli, prior to proteolytic processing, almost invariably will be N-formylmethionine.

The modifications can be a function of how the protein is made. For recombinant polypeptides, for example, the modifications will be determined by the host cell posttranslational modification capacity and the modification signals in the polypeptide amino acid sequence. Accordingly, when glycosylation is desired, a polypeptide should be expressed in a glycosylating host, generally a eukaryotic cell. Insect cells often carry out the same posttranslational glycosylations as mammalian cells and, for this reason, insect cell expression systems have been developed to efficiently express mammalian proteins having native patterns of glycosylation. Similar considerations apply to other modifications.

The same type of modification may be present in the same or varying degree at several sites in a given polypeptide. Also, a given polypeptide may contain more than one type of modification.

Polypeptide Uses

The protein sequences of the present invention can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.

The aminopeptidase polypeptides are useful for producing antibodies specific for the aminopeptidase, regions, or fragments. Regions having a high antigenicity index score are shown in FIG. 2.

The aminopeptidase polypeptides are useful for biological assays related to aminopeptidases. Such assays involve any of the known aminopeptidase functions or activities or properties useful for diagnosis and treatment of aminopeptidase-related conditions.

The aminopeptidase polypeptides are also useful in drug screening assays, in cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express the aminopeptidase, as a biopsy or expanded in cell culture. In one embodiment, however, cell-based assays involve recombinant host cells expressing the aminopeptidase.

Determining the ability of the test compound to interact with the aminopeptidase can also comprise determining the ability of the test compound to preferentially bind to the polypeptide as compared to the ability of a known binding molecule to bind to the polypeptide.

The polypeptides can be used to identify compounds that modulate aminopeptidase activity. Such compounds, for example, can increase or decrease affinity or rate of binding to peptide substrate, compete with peptide substrate for binding to the aminopeptidase, or displace peptide substrate bound to the aminopeptidase. Both aminopeptidase and appropriate variants and fragments can be used in high-throughput screens to assay candidate compounds for the ability to bind to the aminopeptidase. These compounds can be further screened against a functional aminopeptidase to determine the effect of the compound on the aminopeptidase activity. Compounds can be identified that activate (agonist) or inactivate (antagonist) the aminopeptidase to a desired degree. Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject).

The aminopeptidase polypeptides can be used to screen a compound for the ability to stimulate or inhibit interaction between the aminopeptidase protein and a target molecule that normally interacts with the aminopeptidase protein, for example, substrate-peptide or zinc component. The assay includes the steps of combining the aminopeptidase protein with a candidate compound under conditions that allow the aminopeptidase protein or fragment to interact with the target molecule, and to detect the formation of a complex between the aminopeptidase protein and the target or to detect the biochemical consequence of the interaction with the aminopeptidase and the target.

Determining the ability of the aminopeptidase to bind to a target molecule can also be accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA). See Sjolander et al. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705. As used herein, “BIA” is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.

The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233. Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra).

Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al. (1991) Nature 354:82-84; Houghten et al. (1991) Nature 354:84-86) and combinatorial chemistry-derived molecular libraries made of D- and/or L- configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al. (1993) Cell 72:767-778); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)₂, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries).

One candidate compound is a soluble full-length aminopeptidase or fragment that competes for peptide binding. Other candidate compounds include mutant aminopeptidases or appropriate fragments containing mutations that affect aminopeptidase function and compete for peptide substrate. Accordingly, a fragment that competes for substrate, for example with a higher affinity, or a fragment that binds substrate but does not degrade it, is encompassed by the invention.

The invention provides other end points to identify compounds that modulate (stimulate or inhibit) aminopeptidase activity. The assays typically involve an assay of cellular events that indicate aminopeptidase activity. Thus, the expression of genes that are up- or down-regulated in response to the aminopeptidase activity can be assayed. In one embodiment, the regulatory region of such genes can be operably linked to a marker that is easily detectable, such as luciferase. Alternatively, modification of the aminopeptidase could also be measured.

Any of the biological or biochemical functions mediated by the aminopeptidase can be used as an endpoint assay. These include all of the biochemical or biochemical/biological events described herein, in the references cited herein, incorporated by reference for these endpoint assay targets, and other functions known to those of ordinary skill in the art.

In the case of the aminopeptidase, specific end points can include peptide bond hydrolysis.

Binding and/or activating compounds can also be screened by using chimeric aminopeptidase proteins in which one or more regions, segments, sites, and the like, as disclosed herein, or parts thereof, can be replaced by their heterologous counterparts derived from other aminopeptidases. For example, a catalytic region can be used that interacts with a different peptide sequence specificity and/or affinity than the native aminopeptidase. Accordingly, a different set of components is available as an end-point assay for activation. As a further alternative, the site of modification by an effector protein, for example phosphorylation, can be replaced with the site for a different effector protein. Activation can also be detected by a reporter gene containing an easily detectable coding region operably linked to a transcriptional regulatory sequence that is part of the native pathway in which the aminopeptidase is involved.

The aminopeptidase polypeptides are also useful in competition binding assays in methods designed to discover compounds that interact with the aminopeptidase. Thus, a compound is exposed to an aminopeptidase polypeptide under conditions that allow the compound to bind or to otherwise interact with the polypeptide. Soluble aminopeptidase polypeptide is also added to the mixture. If the test compound interacts with the soluble aminopeptidase polypeptide, it decreases the amount of complex formed or activity from the aminopeptidase target. This type of assay is particularly useful in cases in which compounds are sought that interact with specific regions of the aminopeptidase. Thus, the soluble polypeptide that competes with the target aminopeptidase region is designed to contain peptide sequences corresponding to the region of interest.

Another type of competition-binding assay can be used to discover compounds that interact with specific functional sites. As an example, bindable zinc and a candidate compound can be added to a sample of the aminopeptidase. Compounds that interact with the aminopeptidase at the same site as the zinc will reduce the amount of complex formed between the aminopeptidase and the zinc. Accordingly, it is possible to discover a compound that specifically prevents interaction between the aminopeptidase and the zinc component. Another example involves adding a candidate compound to a sample of aminopeptidase and substrate peptide. A compound that competes with the peptide will reduce the amount of hydrolysis or binding of the peptide to the aminopeptidase. Accordingly, compounds can be discovered that directly interact with the aminopeptidase and compete with the peptide. Such assays can involve any other component that interacts with the aminopeptidase.
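The reduction in complex formation or hydrolysis described above is commonly expressed as a percent inhibition relative to a control sample without the candidate compound. The following is a minimal sketch of that arithmetic; the function name, the optional background correction, and the numerical readings are illustrative assumptions and are not values taken from this disclosure.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction of an assay signal caused by a candidate compound.

    control_signal: signal (e.g., complex formed or peptide hydrolyzed)
                    measured without the candidate compound.
    test_signal:    signal measured with the candidate compound present.
    background:     signal measured with no aminopeptidase (assumed blank).
    """
    span = control_signal - background
    if span <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (control_signal - test_signal) / span


# Hypothetical readings in arbitrary units.
print(percent_inhibition(1000.0, 250.0))          # no blank correction
print(percent_inhibition(1100.0, 600.0, 100.0))   # blank-corrected
```

A compound that competes with zinc or with the substrate peptide would be expected to give a positive value in such a calculation; a value near zero suggests no competition under the conditions tested.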

To perform cell free drug screening assays, it is desirable to immobilize either the aminopeptidase, or fragment, or its target molecule to facilitate separation of complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay.

Techniques for immobilizing proteins on matrices can be used in the drug screening assays. In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to be bound to a matrix. For example, glutathione-S-transferase/aminopeptidase fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the cell lysates (e.g., ³⁵S-labeled) and the candidate compound, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the matrix immobilized and radiolabel determined directly, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of aminopeptidase-binding protein found in the bead fraction quantitated from the gel using standard electrophoretic techniques. For example, either the polypeptide or its target molecule can be immobilized utilizing conjugation of biotin and streptavidin using techniques well known in the art. Alternatively, antibodies reactive with the protein but which do not interfere with binding of the protein to its target molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by antibody conjugation. Preparations of an aminopeptidase-binding target component, such as a peptide or zinc component, and a candidate compound are incubated in the aminopeptidase-presenting wells and the amount of complex trapped in the well can be quantitated. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the aminopeptidase target molecule, or which are reactive with aminopeptidase and compete with the target molecule; as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the target molecule.

Modulators of aminopeptidase activity identified according to these drug screening assays can be used to treat a subject with a disorder related to the aminopeptidase, by treating cells that express the aminopeptidase, such as any of those shown in FIGS. 5 and 6. These methods of treatment include the steps of administering the modulators of aminopeptidase activity in a pharmaceutical composition as described herein, to a subject in need of such treatment.

Disorders involving the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

Disorders in which aminopeptidase expression is especially relevant include, but are not limited to, breast and colon carcinoma, lung carcinoma, especially squamous cell carcinoma, and insulin-related disorders such as diabetes.

The aminopeptidase is overexpressed in lung, breast, and colon cancer. As such, the gene is particularly relevant for the treatment of these disorders, where inhibiting expression of the gene could affect tumor development and/or progression.

The aminopeptidase is also expressed in the tissues shown in FIGS. 5 and 6, and as such is specifically involved in disorders relating to these tissues.

The aminopeptidase polypeptides are thus useful for treating an aminopeptidase-associated disorder characterized by aberrant expression or activity of an aminopeptidase. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) expression or activity of the protein. In another embodiment, the method involves administering the aminopeptidase as therapy to compensate for reduced or aberrant expression or activity of the protein.

Methods for treatment include but are not limited to the use of soluble aminopeptidase or fragments of the aminopeptidase protein that compete for substrate or any other component that directly interacts with the aminopeptidase, such as zinc or any of the enzymes that modify the aminopeptidase. These aminopeptidases or fragments can have a higher affinity for the target so as to provide effective competition.

Stimulation of activity is desirable in situations in which the protein is abnormally downregulated and/or in which increased activity is likely to have a beneficial effect. Likewise, inhibition of activity is desirable in situations in which the protein is abnormally upregulated and/or in which decreased activity is likely to have a beneficial effect. In one example of such a situation, a subject has a disorder characterized by aberrant development or cellular differentiation. In another example, the subject has a proliferative disease (e.g., cancer) or a disorder characterized by an aberrant hematopoietic response. In another example, it is desirable to achieve tissue regeneration in a subject (e.g., where a subject has undergone brain or spinal cord injury and it is desirable to regenerate neuronal tissue in a regulated manner).

In yet another aspect of the invention, the proteins of the invention can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO 94/10300), to identify other proteins (captured proteins) which bind to or interact with the proteins of the invention and modulate their activity.

The aminopeptidase polypeptides also are useful to provide a target for diagnosing a disease or predisposition to disease mediated by the aminopeptidase, including, but not limited to, those diseases discussed herein, and particularly lung, breast, and colon carcinoma and insulin-related disorders, such as diabetes. Targets are useful for diagnosing a disease or predisposition to disease mediated by the aminopeptidase, in the tissues shown in FIGS. 5 and 6. Accordingly, methods are provided for detecting the presence, or levels of, the aminopeptidase in a cell, tissue, or organism. The method involves contacting a biological sample with a compound capable of interacting with the aminopeptidase such that the interaction can be detected.

One agent for detecting aminopeptidase is an antibody capable of selectively binding to aminopeptidase. A biological sample includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject.

The aminopeptidase also provides a target for diagnosing active disease, or predisposition to disease, in a patient having a variant aminopeptidase. Thus, aminopeptidase can be isolated from a biological sample and assayed for the presence of a genetic mutation that results in an aberrant protein. This includes amino acid substitution, deletion, insertion, rearrangement (as the result of aberrant splicing events), and inappropriate post-translational modification. Analytic methods include altered electrophoretic mobility, altered tryptic peptide digest, altered aminopeptidase activity in cell-based or cell-free assay, alteration in peptide binding or degradation, zinc binding or antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the known assay techniques useful for detecting mutations in a protein in general or in an aminopeptidase specifically.

In vitro techniques for detection of aminopeptidase include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. Alternatively, the protein can be detected in vivo in a subject by introducing into the subject a labeled anti-aminopeptidase antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. Particularly useful are methods that detect the allelic variant of the aminopeptidase expressed in a subject, and methods that detect fragments of the aminopeptidase in a sample.

The aminopeptidase polypeptides are also useful in pharmacogenomic analysis. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Eichelbaum, M. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985, and Linder, M. W. (1997) Clin. Chem. 43(2):254-266. The clinical outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism. Thus, the genotype of the individual can determine the way a therapeutic compound acts on the body or the way the body metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the individual permit the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may lead to allelic protein variants of the aminopeptidase in which one or more of the aminopeptidase functions in one population is different from those in another population. The polypeptides thus allow a target to ascertain a genetic predisposition that can affect treatment modality. Thus, in a peptide-based treatment, polymorphism may give rise to catalytic regions that are more or less active. Accordingly, dosage would necessarily be modified to maximize the therapeutic effect within a given population containing the polymorphism. As an alternative to genotyping, specific polymorphic polypeptides could be identified.

The aminopeptidase polypeptides are also useful for monitoring therapeutic effects during clinical trials and other treatment. Thus, the therapeutic effectiveness of an agent that is designed to increase or decrease gene expression, protein levels or aminopeptidase activity can be monitored over the course of treatment using the aminopeptidase polypeptides as an end-point target. The monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression or activity of the protein in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the protein in the post-administration samples; (v) comparing the level of expression or activity of the protein in the pre-administration sample with the protein in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.
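Steps (v) and (vi) above can be sketched as a simple comparison. The sketch below is illustrative only and is not clinical guidance: the function name, the use of a target level, the averaging of post-administration samples, and the 10% tolerance are all assumptions introduced here, since the text does not specify how the comparison is made.

```python
from statistics import mean


def dose_adjustment(pre_level: float, post_levels: list[float],
                    target_level: float, tolerance: float = 0.10) -> str:
    """Suggest "increase", "decrease", or "maintain" for agent administration.

    pre_level:    expression or activity in the pre-administration sample.
    post_levels:  expression or activity in post-administration samples.
    target_level: hypothetical desired level (an assumption of this sketch).
    tolerance:    hypothetical acceptable fractional deviation from target.
    """
    post = mean(post_levels)
    lower = target_level * (1 - tolerance)
    upper = target_level * (1 + tolerance)
    if lower <= post <= upper:
        return "maintain"
    # The agent is meant to lower the level if the target lies below baseline.
    agent_lowers_level = target_level < pre_level
    if agent_lowers_level:
        # Still above target: more agent; overshot below target: less agent.
        return "increase" if post > upper else "decrease"
    # Agent raises the level: below target needs more, above target needs less.
    return "increase" if post < lower else "decrease"


# Hypothetical activity readings in arbitrary units.
print(dose_adjustment(100.0, [90.0, 80.0], target_level=50.0))
print(dose_adjustment(100.0, [30.0, 40.0], target_level=50.0))
```

In practice the comparison in step (v) could use any suitable statistic; the mean is used here only to keep the example short.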

Antibodies

The invention also provides antibodies that selectively bind to the aminopeptidase and its variants and fragments. An antibody is considered to selectively bind, even if it also binds to other proteins that are not substantially homologous with the aminopeptidase. These other proteins share homology with a fragment or domain of the aminopeptidase. This conservation in specific regions gives rise to antibodies that bind to both proteins by virtue of the homologous sequence. In this case, it would be understood that antibody binding to the aminopeptidase is still selective.

To generate antibodies, an isolated aminopeptidase polypeptide is used as an immunogen to generate antibodies using standard techniques for polyclonal and monoclonal antibody preparation. Either the full-length protein or an antigenic peptide fragment can be used. Regions having a high antigenicity index are shown in FIG. 2.

Antibodies are preferably prepared from these regions or from discrete fragments in these regions. However, antibodies can be prepared from any region of the peptide as described herein. A preferred fragment produces an antibody that diminishes or completely prevents peptide hydrolysis or binding. Antibodies can be developed against the entire aminopeptidase or domains of the aminopeptidase as described herein, for example, the zinc binding region, metalloprotease motif (IAHELAHQW), the GAMEN motif, sites contributing to exopeptidase specificity, and the peptidase domain or subregions thereof. Antibodies can also be developed against specific functional sites as disclosed herein.
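
As a minimal sketch, the two sequence motifs named above can be located in a protein sequence by exact string matching. The demonstration sequence is invented for illustration and is not SEQ ID NO 1; the function and dictionary names are likewise hypothetical.

```python
import re

# Motifs named in the text above.
MOTIFS = {
    "metalloprotease motif": "IAHELAHQW",
    "GAMEN motif": "GAMEN",
}

def find_motifs(protein_seq):
    """Return {motif name: [1-based start positions]} for exact matches."""
    hits = {}
    for name, motif in MOTIFS.items():
        pattern = re.escape(motif)
        hits[name] = [m.start() + 1 for m in re.finditer(pattern, protein_seq)]
    return hits

# Invented toy sequence containing both motifs (NOT the disclosed protein).
demo = "MKT" + "GAMEN" + "WGLV" + "IAHELAHQW" + "FGN"
print(find_motifs(demo))
```

Residue positions found this way could then be used to pick fragments spanning a motif as candidate immunogens.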

The antigenic peptide can comprise a contiguous sequence of at least 12, 14, 15, or 30 amino acid residues. In one embodiment, fragments correspond to regions that are located on the surface of the protein, e.g., hydrophilic regions. These fragments are not to be construed, however, as encompassing any fragments that may have been disclosed prior to the invention.
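
One way to look for hydrophilic stretches of a given length is a sliding-window average of a hydropathy scale. The sketch below uses the Kyte-Doolittle scale as a stand-in; the specification itself refers to an antigenicity index (FIG. 2), so this is a substitute technique shown under that assumption, and the function name and test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def most_hydrophilic_window(seq, length=12):
    """Return (1-based start, peptide, mean hydropathy) of the most
    hydrophilic contiguous window of the given length."""
    if len(seq) < length:
        raise ValueError("sequence is shorter than the window length")
    best = None
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        score = sum(KD[aa] for aa in window) / length
        if best is None or score < best[2]:
            best = (i + 1, window, score)
    return best

# Invented toy sequence: a charged stretch flanked by hydrophobic residues.
start, peptide, score = most_hydrophilic_window("LLLLLLLLLL" + "KKKKDDDDEEEE" + "VVVVVVVV")
print(start, peptide, round(score, 2))
```

The default window of 12 residues matches the shortest antigenic peptide length mentioned above; the other lengths can be passed through the `length` argument.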

Antibodies can be polyclonal or monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used.

Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

An appropriate immunogenic preparation can be derived from native, recombinantly expressed, or chemically synthesized peptides.

Antibody Uses

The antibodies can be used to isolate an aminopeptidase by standard techniques, such as affinity chromatography or immunoprecipitation. The antibodies can facilitate the purification of the natural aminopeptidase from cells and recombinantly produced aminopeptidase expressed in host cells.

The antibodies are useful to detect the presence of aminopeptidase in cells or tissues to determine the pattern of expression of the aminopeptidase among various tissues in an organism and over the course of normal development.

The antibodies can be used to detect aminopeptidase in situ, in vitro, or in a cell lysate or supernatant in order to evaluate the abundance and pattern of expression.

The antibodies can be used to assess abnormal tissue distribution or abnormal expression during development.

Antibody detection of circulating fragments of the full-length aminopeptidase can be used to identify aminopeptidase turnover.

Further, the antibodies can be used to assess aminopeptidase expression in disease states such as in active stages of the disease or in an individual with a predisposition toward disease related to aminopeptidase function. When a disorder is caused by an inappropriate tissue distribution, developmental expression, or level of expression of the aminopeptidase protein, the antibody can be prepared against the normal aminopeptidase protein. If a disorder is characterized by a specific mutation in the aminopeptidase, antibodies specific for this mutant protein can be used to assay for the presence of the specific mutant aminopeptidase. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular aminopeptidase peptide regions.

The antibodies can also be used to assess normal and aberrant subcellular localization of the aminopeptidase in cells of the various tissues in an organism. Antibodies can be developed against the whole aminopeptidase or portions of the aminopeptidase, for example, portions of the peptidase domain from amino acid 69-458, including the substrate recognition site.

The diagnostic uses can be applied, not only in genetic testing, but also in monitoring a treatment modality. Accordingly, where treatment is ultimately aimed at correcting aminopeptidase expression level or the presence of aberrant aminopeptidases and aberrant tissue distribution or developmental expression, antibodies directed against the aminopeptidase or relevant fragments can be used to monitor therapeutic efficacy.

Antibodies accordingly can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen.

Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies prepared against polymorphic aminopeptidase can be used to identify individuals that require modified treatment modalities.

The antibodies are also useful as diagnostic tools as an immunological marker for aberrant aminopeptidase analyzed by electrophoretic mobility, isoelectric point, tryptic peptide digest, and other physical assays known to those in the art.

The antibodies are also useful for tissue typing. Thus, where a specific aminopeptidase has been correlated with expression in a specific tissue, antibodies that are specific for this aminopeptidase can be used to identify a tissue type.

The antibodies are also useful in forensic identification. Accordingly, where an individual has been correlated with a specific genetic polymorphism resulting in a specific polymorphic protein, an antibody specific for the polymorphic protein can be used as an aid in identification.

The antibodies are also useful for inhibiting aminopeptidase function, for example, zinc binding, and peptide binding and/or hydrolysis.

These uses can also be applied in a therapeutic context in which treatment involves inhibiting aminopeptidase function. An antibody can be used, for example, to block peptide binding. Antibodies can be prepared against specific fragments containing sites required for function or against intact aminopeptidase associated with a cell.

Completely human antibodies are particularly desirable for therapeutic treatment of human patients. For an overview of this technology for producing human antibodies, see Lonberg et al. (1995) Int. Rev. Immunol. 13:65-93. For a detailed discussion of this technology for producing human antibodies and human monoclonal antibodies and protocols for producing such antibodies, see, e.g., U.S. Pat. No. 5,625,126; U.S. Pat. No. 5,633,425; U.S. Pat. No. 5,569,825; U.S. Pat. No. 5,661,016; and U.S. Pat. No. 5,545,806.

The invention also encompasses kits for using antibodies to detect the presence of an aminopeptidase protein in a biological sample. The kit can comprise antibodies such as a labeled or labelable antibody and a compound or agent for detecting aminopeptidase in a biological sample; means for determining the amount of aminopeptidase in the sample; and means for comparing the amount of aminopeptidase in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect aminopeptidase.

Polynucleotides

The nucleotide sequence in SEQ ID NO 2 was obtained by sequencing the deposited human cDNA. Accordingly, the sequence of the deposited clone is controlling as to any discrepancies between the two and any reference to the sequence of SEQ ID NO 2 includes reference to the sequence of the deposited cDNA.

The specifically disclosed cDNA comprises the coding region and 5′ and 3′ untranslated sequences in SEQ ID NO 2.

The invention provides isolated polynucleotides encoding the novel aminopeptidase. The term “aminopeptidase polynucleotide” or “aminopeptidase nucleic acid” refers to the sequence shown in SEQ ID NO 2 or in the deposited cDNA. The term “aminopeptidase polynucleotide” or “aminopeptidase nucleic acid” further includes variants and fragments of the aminopeptidase polynucleotides.

An “isolated” aminopeptidase nucleic acid is one that is separated from other nucleic acid present in the natural source of the aminopeptidase nucleic acid. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the aminopeptidase nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. However, there can be some flanking nucleotide sequences, for example up to about 5 kb. The important point is that the aminopeptidase nucleic acid is isolated from flanking sequences such that it can be subjected to the specific manipulations described herein, such as recombinant expression, preparation of probes and primers, and other uses specific to the aminopeptidase nucleic acid sequences.

Moreover, an “isolated” nucleic acid molecule, such as a cDNA or RNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. However, the nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated.

In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present.

For example, recombinant DNA molecules contained in a vector are considered isolated. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the isolated DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

The aminopeptidase polynucleotides can encode the mature protein plus additional amino or carboxyterminal amino acids, or amino acids interior to the mature polypeptide (when the mature form has more than one polypeptide chain, for instance). Such sequences may play a role in processing of a protein from precursor to a mature form, facilitate protein trafficking, prolong or shorten protein half-life or facilitate manipulation of a protein for assay or production, among other things. As generally is the case in situ, the additional amino acids may be processed away from the mature protein by cellular enzymes.

The aminopeptidase polynucleotides include, but are not limited to, the sequence encoding the mature polypeptide alone, the sequence encoding the mature polypeptide and additional coding sequences, such as a leader or secretory sequence (e.g., a pre-pro or pro-protein sequence), the sequence encoding the mature polypeptide, with or without the additional coding sequences, plus additional non-coding sequences, for example introns and non-coding 5′ and 3′ sequences such as transcribed but non-translated sequences that play a role in transcription, mRNA processing (including splicing and polyadenylation signals), ribosome binding and stability of mRNA. In addition, the polynucleotide may be fused to a marker sequence encoding, for example, a peptide that facilitates purification.

Aminopeptidase polynucleotides can be in the form of RNA, such as mRNA, or in the form of DNA, including cDNA and genomic DNA obtained by cloning or produced by chemical synthetic techniques or by a combination thereof. The nucleic acid, especially DNA, can be double-stranded or single-stranded. Single-stranded nucleic acid can be the coding strand (sense strand) or the non-coding strand (anti-sense strand).

Aminopeptidase nucleic acid can comprise the nucleotide sequences shown in SEQ ID NO 2, corresponding to human endothelial cell cDNA.

In one embodiment, the aminopeptidase nucleic acid comprises only the coding region.

The invention further provides variant aminopeptidase polynucleotides, and fragments thereof, that differ from the nucleotide sequence shown in SEQ ID NO 2 due to degeneracy of the genetic code and thus encode the same protein as that encoded by the nucleotide sequence shown in SEQ ID NO 2.
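
Degeneracy of the genetic code can be illustrated with a short translation routine: two nucleotide sequences that differ at synonymous codon positions yield the same peptide. The sketch uses the standard genetic code; the two short sequences are invented for the example and are not taken from SEQ ID NO 2.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3   # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Two invented sequences that differ only at third codon positions.
seq_a = "ATGGCTCATGAA"
seq_b = "ATGGCCCACGAG"
print(translate(seq_a), translate(seq_b), seq_a != seq_b)
```

Both sequences translate to the same four-residue peptide even though three of their twelve nucleotides differ.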

The invention also provides aminopeptidase nucleic acid molecules encoding the variant polypeptides described herein. Such polynucleotides may be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or may be constructed by recombinant DNA methods or by chemical synthesis. Such non-naturally occurring variants may be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. Accordingly, as discussed above, the variants can contain nucleotide substitutions, deletions, inversions and insertions.

Typically, variants have a substantial identity with the nucleic acid molecules of SEQ ID NO 2 and the complements thereof. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions.

Orthologs, homologs, and allelic variants can be identified using methods well known in the art. These variants comprise a nucleotide sequence encoding an aminopeptidase that is typically at least about 60-65%, 65-70%, 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more homologous to the nucleotide sequence shown in SEQ ID NO 2 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions to the nucleotide sequence shown in SEQ ID NO 2 or a fragment of the sequence. It is understood that stringent hybridization does not indicate substantial homology where it is due to general homology, such as poly A sequences, or sequences common to all or most proteins, metalloproteases, all zinc binding proteins, all proteins in the M1 family, or all aminopeptidases. Moreover, it is understood that variants do not include any of the nucleic acid sequences that may have been disclosed prior to the invention.
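
For illustration, a percentage of this kind can be computed from a pairwise alignment as the share of aligned columns with identical residues. The sketch below assumes the two sequences have already been aligned (with '-' for gaps) and counts gap columns as mismatches; both the function name and that convention are assumptions, since the text does not fix a particular formula.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences ('-' = gap).

    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch (one common convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Invented toy alignment: one substitution in ten columns.
print(percent_identity("ATGGCTCATG", "ATGGCCCATG"))
```

A variant could then be checked against one of the thresholds listed above, for example by testing whether the returned value is at least 90.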

As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences encoding a polypeptide at least about 50-55% homologous to each other typically remain hybridized to each other. The conditions can be such that sequences at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 90%, at least about 95% or more identical to each other remain hybridized to one another. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, incorporated by reference. One example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. In another non-limiting example, nucleic acid molecules are allowed to hybridize in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more low stringency washes in 0.2×SSC/0.1% SDS at room temperature, or by one or more moderate stringency washes in 0.2×SSC/0.1% SDS at 42° C., or washed in 0.2×SSC/0.1% SDS at 65° C. for high stringency. In one embodiment, an isolated nucleic acid molecule that hybridizes under stringent conditions to the sequence of SEQ ID NO 2 corresponds to a naturally-occurring nucleic acid molecule. As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).
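
The three wash examples above (room temperature, 42° C., and 65° C. in 0.2×SSC/0.1% SDS) can be captured in a small lookup. In this sketch the class and function names are hypothetical, and treating every temperature between the stated points as belonging to the lower category is an assumption made only to keep the example simple.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Wash:
    ssc: float          # SSC concentration, in multiples of 1x
    sds_percent: float  # SDS concentration, percent
    temp_c: float       # wash temperature, degrees Celsius

def classify_wash(wash):
    """Label a 0.2xSSC/0.1% SDS wash as low, moderate, or high stringency.

    The cut-offs at 42 and 65 degrees come from the examples in the text;
    assigning intermediate temperatures to the lower label is an assumption.
    """
    if (wash.ssc, wash.sds_percent) != (0.2, 0.1):
        raise ValueError("only the 0.2xSSC/0.1% SDS example is covered")
    if wash.temp_c >= 65:
        return "high"
    if wash.temp_c >= 42:
        return "moderate"
    return "low"

print(classify_wash(Wash(ssc=0.2, sds_percent=0.1, temp_c=65)))
```

In practice stringency also depends on probe length, base composition, and formamide concentration, so a lookup like this is at most a bookkeeping aid.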

As understood by those of ordinary skill, the exact conditions can be determined empirically and depend on ionic strength, temperature and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS. Other factors considered in determining the desired hybridization conditions include the length of the nucleic acid sequences, base composition, percent mismatch between the hybridizing sequences and the frequency of occurrence of subsets of the sequences within other non-identical sequences. Thus, equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules.

The present invention also provides isolated nucleic acids that contain a single or double stranded fragment or portion that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO 2 or the complement of SEQ ID NO 2. In one embodiment, the nucleic acid consists of a portion of the nucleotide sequence of SEQ ID NO 2 and the complement of SEQ ID NO 2. The nucleic acid fragments of the invention are at least about 15, preferably at least about 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100, 200, 500 or more nucleotides in length. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic proteins or polypeptides described herein are useful.

Furthermore, the invention provides polynucleotides that comprise a fragment of the full-length aminopeptidase polynucleotides. The fragment can be single or double-stranded and can comprise DNA or RNA. The fragment can be derived from either the coding or the non-coding sequence.

In another embodiment an isolated aminopeptidase nucleic acid encodes the entire coding region. In another embodiment the isolated aminopeptidase nucleic acid encodes a sequence corresponding to the mature protein that may be from about amino acid 6 to the last amino acid. Other fragments include nucleotide sequences encoding the amino acid fragments described herein.

Thus, aminopeptidase nucleic acid fragments further include sequences corresponding to the regions described herein, subregions also described, and specific functional sites. Aminopeptidase nucleic acid fragments also include combinations of the regions, segments, motifs, and other functional sites described above. A person of ordinary skill in the art would be aware of the many permutations that are possible.

Where the location of the regions or sites has been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these regions can vary depending on the criteria used to define the regions.

However, it is understood that an aminopeptidase fragment includes any nucleic acid sequence that does not include the entire gene.

The invention also provides aminopeptidase nucleic acid fragments that encode epitope bearing regions of the aminopeptidase proteins described herein.

Nucleic acid fragments, according to the present invention, are not to be construed as encompassing those fragments that may have been disclosed prior to the invention.

Polynucleotide Uses

The nucleotide sequences of the present invention can be used as a “query sequence” to perform a search against public databases, for example, to identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.

The nucleic acid fragments of the invention provide probes or primers in assays such as those described below. “Probes” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al. (1991) Science 254:1497-1500. Typically, a probe comprises a region of nucleotide sequence that hybridizes under highly stringent conditions to at least about 15, typically about 20-25, and more typically about 40, 50 or 75 consecutive nucleotides of the nucleic acid sequence shown in SEQ ID NO 2 and the complements thereof. More typically, the probe further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.

As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis using well-known methods (e.g., PCR, LCR) including, but not limited to those described herein. The appropriate length of the primer depends on the particular use, but typically ranges from about 15 to 30 nucleotides. The term “primer site” refers to the area of the target DNA to which a primer hybridizes. The term “primer pair” refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the nucleic acid sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the sequence to be amplified.
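
A primer pair in the sense defined above can be sketched as follows, using the common PCR convention that the upstream primer copies the first bases of the sense strand and the downstream primer is the reverse complement of its last bases. That convention, the function names, and the toy template are assumptions for illustration; real primer design would also weigh melting temperature, GC content, and self-complementarity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def primer_pair(template, length=18):
    """Return (upstream, downstream) primers flanking the whole template.

    template -- sense-strand sequence of the region to amplify, 5' to 3'
    length   -- primer length; restricted to the 15-30 nt range in the text
    """
    if not 15 <= length <= 30:
        raise ValueError("primer length should be about 15 to 30 nucleotides")
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    upstream = template[:length].upper()
    downstream = reverse_complement(template[-length:])
    return upstream, downstream

# Invented toy template (NOT taken from SEQ ID NO 2).
fwd, rev = primer_pair("ATGGCTCATGAACTGGCACATCAGTGGTTCGGTAACCTGGTTACC", length=18)
print(fwd, rev)
```

Both primers are written 5′ to 3′, which is the orientation in which oligonucleotides are normally ordered.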

The aminopeptidase polynucleotides are thus useful for probes, primers, and in biological assays.

Where the polynucleotides are used to assess aminopeptidase properties or functions, such as in the assays described herein, all or less than all of the entire cDNA can be useful. Assays specifically directed to aminopeptidase functions, such as assessing agonist or antagonist activity, encompass the use of known fragments. Further, diagnostic methods for assessing aminopeptidase function can also be practiced with any fragment, including those fragments that may have been known prior to the invention. Similarly, in methods involving treatment of aminopeptidase dysfunction, all fragments are encompassed, including those that may have been known in the art.

The aminopeptidase polynucleotides are useful as a hybridization probe for cDNA and genomic DNA to isolate a full-length cDNA and genomic clones encoding the polypeptides described in SEQ ID NO 1 and to isolate cDNA and genomic clones that correspond to variants producing the same polypeptides shown in SEQ ID NO 1 or the other variants described herein. Variants can be isolated from the same tissue and organism from which the polypeptides shown in SEQ ID NO 1 were isolated, different tissues from the same organism, or from different organisms. This method is useful for isolating genes and cDNA that are developmentally-controlled and therefore may be expressed in the same tissue or different tissues at different points in the development of an organism.

The probe can correspond to any sequence along the entire length of the gene encoding the aminopeptidase. Accordingly, it could be derived from 5′ noncoding regions, the coding region, and 3′ noncoding regions.

The nucleic acid probe can be, for example, the full-length cDNA of SEQ ID NO 2, or a fragment thereof, such as an oligonucleotide of at least 12, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to mRNA or DNA.

Fragments of the polynucleotides described herein are also useful to synthesize larger fragments or full-length polynucleotides described herein. For example, a fragment can be hybridized to any portion of an mRNA and a larger or full-length cDNA can be produced.

The fragments are also useful to synthesize antisense molecules of desired length and sequence.

Antisense nucleic acids of the invention can be designed using the nucleotide sequences of SEQ ID NO 2, and constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of modified nucleotides which can be used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest).
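
At the sequence level, designing an antisense oligonucleotide amounts to taking the reverse complement of the chosen target region. The sketch below returns an unmodified DNA oligonucleotide for a window of an mRNA; the function name, the 20-nucleotide default, and the toy mRNA are assumptions for the example, and chemical modifications such as those listed above are not modeled.

```python
# Watson-Crick partner (as DNA) for each RNA base of the target.
DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}

def antisense_oligo(mrna, start, length=20):
    """Return a DNA antisense oligo (5'->3') complementary to
    mrna[start:start + length]; `start` is 0-based."""
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the mRNA")
    return "".join(DNA_PARTNER[base] for base in reversed(target))

# Invented toy mRNA segment (NOT taken from SEQ ID NO 2).
print(antisense_oligo("AUGGCUCAUGAACUGGCACAUCAG", start=0, length=12))
```

The function also accepts a cDNA-style sequence written with T, since T is converted to U before the complement is taken.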

Additionally, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorganic & Medicinal Chemistry 4:5). As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996), supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670. PNAs can be further modified, e.g., to enhance their stability, specificity or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996), supra, Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63, Mag et al. (1989) Nucleic Acids Res. 17:5973, and Petersen et al. (1975) Bioorganic Med. Chem. Lett. 5:1119.

The nucleic acid molecules and fragments of the invention can also include other appended groups such as peptides (e.g., for targeting host cell aminopeptidases in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO 88/0918) or the blood brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm Res. 5:539-549).

The aminopeptidase polynucleotides are also useful as primers for PCR to amplify any given region of an aminopeptidase polynucleotide.

The aminopeptidase polynucleotides are also useful for constructing recombinant vectors. Such vectors include expression vectors that express a portion of, or all of, the aminopeptidase polypeptides. Vectors also include insertion vectors, used to integrate into another polynucleotide sequence, such as into the cellular genome, to alter in situ expression of aminopeptidase genes and gene products. For example, an endogenous aminopeptidase coding sequence can be replaced via homologous recombination with all or part of the coding region containing one or more specifically introduced mutations.

The aminopeptidase polynucleotides are also useful for expressing antigenic portions of the aminopeptidase proteins.

The aminopeptidase polynucleotides are also useful as probes for determining the chromosomal positions of the aminopeptidase polynucleotides by means of in situ hybridization methods, such as FISH (for a review of this technique, see Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York)), and PCR mapping of somatic cell hybrids. The mapping of the sequences to chromosomes is an important first step in correlating these sequences with genes associated with disease.

Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library.) The relationship between a gene and a disease mapped to the same chromosomal region can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland et al. (1987) Nature 325:783-787.

Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with a specified gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations, that are visible from chromosome spreads, or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
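
The comparison described above (a mutation seen in some or all affected individuals but in no unaffected individual) reduces to simple set operations once each individual's variants are known. In this sketch the function name and the variant labels are invented for illustration.

```python
def candidate_causative_variants(affected, unaffected):
    """Return {variant: number of affected carriers} for variants found in
    at least one affected individual and in no unaffected individual.

    affected, unaffected -- lists of sets of variant labels, one set
    per individual.
    """
    seen_in_unaffected = set().union(*unaffected)
    counts = {}
    for variants in affected:
        for variant in variants - seen_in_unaffected:
            counts[variant] = counts.get(variant, 0) + 1
    return counts

# Invented variant labels for two affected and two unaffected individuals.
affected = [{"c.100A>G", "c.5C>T"}, {"c.100A>G"}]
unaffected = [{"c.5C>T"}, set()]
print(candidate_causative_variants(affected, unaffected))
```

A variant that also occurs in unaffected individuals, like the second label here, is filtered out as a likely polymorphism rather than a candidate mutation.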

The aminopeptidase polynucleotide probes are also useful to determine patterns of the presence of the gene encoding the aminopeptidases and their variants with respect to tissue distribution, for example, whether gene duplication has occurred and whether the duplication occurs in all or only a subset of tissues. The genes can be naturally occurring or can have been introduced into a cell, tissue, or organism exogenously.

The aminopeptidase polynucleotides are also useful for designing ribozymes corresponding to all, or a part, of the mRNA produced from genes encoding the polynucleotides described herein.

The aminopeptidase polynucleotides are also useful for constructing host cells expressing a part, or all, of the aminopeptidase polynucleotides and polypeptides.

The aminopeptidase polynucleotides are also useful for constructing transgenic animals expressing all, or a part, of the aminopeptidase polynucleotides and polypeptides.

The aminopeptidase polynucleotides are also useful for making vectors that express part, or all, of the aminopeptidase polypeptides.

The aminopeptidase polynucleotides are also useful as hybridization probes for determining the level of aminopeptidase nucleic acid expression. Accordingly, the probes can be used to detect the presence of, or to determine levels of, aminopeptidase nucleic acid in cells, tissues, and in organisms. The nucleic acid whose level is determined can be DNA or RNA. Accordingly, probes corresponding to the polypeptides described herein can be used to assess gene copy number in a given cell, tissue, or organism. This is particularly relevant in cases in which there has been an amplification of the aminopeptidase genes.

Alternatively, the probe can be used in an in situ hybridization context to assess the position of extra copies of the aminopeptidase genes, as on extrachromosomal elements or as integrated into chromosomes in which the aminopeptidase gene is not normally found, for example as a homogeneously staining region.

These uses are relevant for diagnosis of disorders involving an increase or decrease in aminopeptidase expression relative to normal, such as a proliferative disorder, a differentiative or developmental disorder, or a hematopoietic disorder.

Disorders in which the aminopeptidase expression is relevant include, but are not limited to, lung and colon carcinomas and insulin-related disorders, such as diabetes.

The aminopeptidase is expressed in the tissues shown in FIGS. 5 and 6. As such, the gene is particularly relevant for the treatment of disorders involving these tissues, especially lung, breast, and colon.

Thus, the present invention provides a method for identifying a disease or disorder associated with aberrant expression or activity of aminopeptidase nucleic acid, in which a test sample is obtained from a subject and nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of the nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant expression or activity of the nucleic acid.

One aspect of the invention relates to diagnostic assays for determining nucleic acid expression as well as activity in the context of a biological sample (e.g., blood, serum, cells, tissue) to determine whether an individual has a disease or disorder, or is at risk of developing a disease or disorder, associated with aberrant nucleic acid expression or activity. Such assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with expression or activity of the nucleic acid molecules.

In vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ hybridization.

Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that express the aminopeptidase, such as by measuring the level of an aminopeptidase-encoding nucleic acid (e.g., mRNA or genomic DNA) in a sample of cells from a subject, or determining if the aminopeptidase gene has been mutated.

Nucleic acid expression assays are useful for drug screening to identify compounds that modulate aminopeptidase nucleic acid expression (e.g., antisense, polypeptides, peptidomimetics, small molecules or other drugs). A cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of the mRNA in the presence of the candidate compound is compared to the level of expression of the mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. The modulator can bind to the nucleic acid or indirectly modulate expression, such as by interacting with other cellular components that affect nucleic acid expression.

Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject) in patients or in transgenic animals.

The invention thus provides a method for identifying a compound that can be used to treat a disorder associated with nucleic acid expression of the aminopeptidase gene. The method typically includes assaying the ability of the compound to modulate the expression of the aminopeptidase nucleic acid and thus identifying a compound that can be used to treat a disorder characterized by undesired aminopeptidase nucleic acid expression.

The assays can be performed in cell-based and cell-free systems. Cell-based assays include cells naturally expressing the aminopeptidase nucleic acid or recombinant cells genetically engineered to express specific nucleic acid sequences.

Alternatively, candidate compounds can be assayed in vivo in patients or in transgenic animals.

The assay for aminopeptidase nucleic acid expression can involve direct assay of nucleic acid levels, such as mRNA levels, or assay of collateral compounds (such as peptide hydrolysis). Further, the expression of genes that are up- or down-regulated in response to the aminopeptidase activity can also be assayed. In this embodiment the regulatory regions of these genes can be operably linked to a reporter gene such as luciferase.

Thus, modulators of aminopeptidase gene expression can be identified in a method wherein a cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of aminopeptidase mRNA in the presence of the candidate compound is compared to the level of expression of aminopeptidase mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. When expression of mRNA is statistically significantly greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid expression is statistically significantly less in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of nucleic acid expression.
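The comparison described above can be sketched in Python. This is an illustrative sketch only, not the method of the invention: the function names, the 0.05 significance threshold, the example expression values, and the choice of an exact permutation test are all assumptions introduced here, since the text requires only that the difference be statistically significant.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(treated, untreated):
    """Two-sided exact permutation p-value for a difference in mean expression.

    Assumption: replicate mRNA measurements are exchangeable under the null
    hypothesis that the candidate compound has no effect.
    """
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabelling n_treated measurements as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so that exact ties are counted as extreme.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from mRNA levels with and without it.

    alpha is a hypothetical significance threshold, not taken from the text.
    """
    if exact_permutation_p(treated, untreated) >= alpha:
        return "no significant effect"
    if mean(treated) > mean(untreated):
        return "stimulator"
    return "inhibitor"
```

With four replicates per condition there are 70 possible relabellings, so the smallest attainable two-sided p-value is 2/70; fewer replicates cannot reach the 0.05 threshold assumed here.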

Accordingly, the invention provides methods of treatment, with the nucleic acid as a target, using a compound identified through drug screening as a gene modulator to modulate aminopeptidase nucleic acid expression. Modulation includes up-regulation (i.e., activation or agonization), down-regulation (suppression or antagonization), and effects on nucleic acid activity (e.g., when nucleic acid is mutated or improperly modified). Treatment is of disorders characterized by aberrant expression or activity of the nucleic acid.

Disorders in which the aminopeptidase expression is relevant include, but are not limited to, those discussed herein, particularly lung and colon carcinomas and insulin-related disorders, such as diabetes.

Alternatively, a modulator for aminopeptidase nucleic acid expression can be a small molecule or drug identified using the screening assays described herein as long as the drug or small molecule inhibits the aminopeptidase nucleic acid expression.

The aminopeptidase polynucleotides are also useful for monitoring the effectiveness of modulating compounds on the expression or activity of the aminopeptidase gene in clinical trials or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the continuing effectiveness of treatment with the compound, particularly with compounds to which a patient can develop resistance. The gene expression pattern can also serve as a marker indicative of a physiological response of the affected cells to the compound. Accordingly, such monitoring would allow either increased administration of the compound or the administration of alternative compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, administration of the compound could be commensurately decreased.

Monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of a specified mRNA or genomic DNA of the invention in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the mRNA or genomic DNA in the post-administration samples; (v) comparing the level of expression or activity of the mRNA or genomic DNA in the pre-administration sample with the level in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.

The aminopeptidase polynucleotides are also useful in diagnostic assays for qualitative changes in aminopeptidase nucleic acid, and particularly in qualitative changes that lead to pathology. The polynucleotides can be used to detect mutations in aminopeptidase genes and gene expression products such as mRNA. The polynucleotides can be used as hybridization probes to detect naturally-occurring genetic mutations in the aminopeptidase gene and thereby to determine whether a subject with the mutation is at risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of one or more nucleotides in the gene; chromosomal rearrangement, such as inversion or transposition; modification of genomic DNA, such as aberrant methylation patterns; or changes in gene copy number, such as amplification. Detection of a mutated form of the aminopeptidase gene associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to disease when the disease results from overexpression, underexpression, or altered expression of an aminopeptidase.

Mutations in the aminopeptidase gene can be detected at the nucleic acid level by a variety of techniques. Genomic DNA can be analyzed directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same way.

In certain embodiments, detection of the mutation involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) PNAS 91:360-364), the latter of which can be particularly useful for detecting point mutations in the gene (see Abravaya et al. (1995) Nucleic Acids Res. 23:675-682). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. Deletions and insertions can be detected by a change in size of the amplified product compared to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA sequences.

It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6:1197), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well-known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.

Alternatively, mutations in an aminopeptidase gene can be directly identified, for example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.
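How a mutation alters a digestion pattern can be illustrated with a short Python sketch. The sequences below are invented placeholders, not SEQ ID NO 2, and the helper name and its parameters are hypothetical; the sketch assumes linear DNA, searches only the given strand, and models a single enzyme by its recognition site and cut offset (EcoRI, which recognizes GAATTC and cuts after the first G, is used as the example).

```python
def digest_fragment_lengths(sequence, site, cut_offset):
    """Return fragment lengths after cutting linear DNA at each recognition site.

    cut_offset is the number of bases into the site at which the strand is cut.
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    # Fragment lengths are the gaps between consecutive cut positions.
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical wild-type and mutant fragments; the mutant has a single
# base substitution (A to G) that destroys the GAATTC site.
wild_type = "AAAAGAATTCTTTTTTGG"
mutant = "AAAAGAGTTCTTTTTTGG"

wild_type_pattern = digest_fragment_lengths(wild_type, "GAATTC", 1)
mutant_pattern = digest_fragment_lengths(mutant, "GAATTC", 1)
pattern_altered = wild_type_pattern != mutant_pattern
```

On a gel, the wild-type digest in this toy case would give two bands and the mutant a single uncut band, which is the kind of pattern difference the text refers to.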

Further, sequence-specific ribozymes (U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.

Perfectly matched sequences can be distinguished from mismatched sequences by nuclease cleavage digestion assays or by differences in melting temperature.

Sequence changes at specific locations can also be assessed by nuclease protection assays such as RNase and S1 protection or the chemical cleavage method.

Furthermore, sequence differences between a mutant aminopeptidase gene and a wild-type gene can be determined by direct DNA sequencing. A variety of automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and Griffin et al. (1993) Appl. Biochem. Biotechnol. 38:147-159).

Other methods for detecting mutations in the gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) PNAS 85:4397; Saleeba et al. (1992) Meth. Enzymol. 217:286-295), electrophoretic mobility of mutant and wild type nucleic acid is compared (Orita et al. (1989) PNAS 86:2766; Cotton et al. (1993) Mutat. Res. 285:125-144; and Hayashi et al. (1992) Genet. Anal. Tech. Appl. 9:73-79), and movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et al. (1985) Nature 313:495). The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In one embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet. 7:5). Examples of other techniques for detecting point mutations include selective oligonucleotide hybridization, selective amplification, and selective primer extension.

In other embodiments, genetic mutations can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin et al. (1996) Human Mutation 7:244-255; Kozal et al. (1996) Nature Medicine 2:753-759). For example, genetic mutations can be identified in two dimensional arrays containing light-generated DNA probes as described in Cronin et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
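The idea of a linear array of sequential overlapping probes can be sketched in Python as follows. This is a simplified in silico illustration, not the array design of Cronin et al.: the function names, the probe length, and the example sequences are assumptions made here, and hybridization is modelled crudely as an exact substring match.

```python
def tile_probes(sequence, probe_length, step=1):
    """Return (offset, probe) pairs tiling the sequence with overlapping probes."""
    last_start = len(sequence) - probe_length
    return [
        (offset, sequence[offset:offset + probe_length])
        for offset in range(0, last_start + 1, step)
    ]


def probes_without_match(control, sample, probe_length):
    """Offsets of control-derived probes that have no perfect match in the sample.

    A run of consecutive offsets points to a base change in the region that
    all of those probes overlap.
    """
    sample_words = {probe for _, probe in tile_probes(sample, probe_length)}
    return [
        offset
        for offset, probe in tile_probes(control, probe_length)
        if probe not in sample_words
    ]


# Hypothetical control and sample sequences differing at position 4 (A to T).
control_seq = "ACGTACGGA"
sample_seq = "ACGTTCGGA"
lost_offsets = probes_without_match(control_seq, sample_seq, 4)
```

Here the four probes starting at offsets 1 through 4 all span the changed base, so together they localize the point mutation to position 4.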

The aminopeptidase polynucleotides are also useful for testing an individual for a genotype that, while not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the polynucleotides can be used to study the relationship between an individual's genotype and the individual's response to a compound used for treatment (pharmacogenomic relationship). In the present case, for example, a mutation in the aminopeptidase gene that results in altered affinity for zinc could result in an excessive or decreased drug effect with standard concentrations of zinc. Accordingly, the aminopeptidase polynucleotides described herein can be used to assess the mutation content of the gene in an individual in order to select an appropriate compound or dosage regimen for treatment.

Thus, polynucleotides displaying genetic variations that affect treatment provide a diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production of recombinant cells and animals containing these polymorphisms allows effective clinical design of treatment compounds and dosage regimens.

The methods can involve obtaining a control biological sample from a control subject, contacting the control sample with a compound or agent capable of detecting mRNA, or genomic DNA, such that the presence of mRNA or genomic DNA is detected in the biological sample, and comparing the presence of mRNA or genomic DNA in the control sample with the presence of mRNA or genomic DNA in the test sample.

The aminopeptidase polynucleotides are also useful for chromosome identification when the sequence is identified with an individual chromosome and to a particular location on the chromosome. First, the DNA sequence is matched to the chromosome by in situ or other chromosome-specific hybridization. Sequences can also be correlated to specific chromosomes by preparing PCR primers that can be used for PCR screening of somatic cell hybrids containing individual chromosomes from the desired species. Only hybrids containing the chromosome containing the gene homologous to the primer will yield an amplified fragment. Sublocalization can be achieved using chromosomal fragments. Other strategies include prescreening with labeled flow-sorted chromosomes and preselection by hybridization to chromosome-specific libraries. Further mapping strategies include fluorescence in situ hybridization, which allows hybridization with probes shorter than those traditionally used. Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on the chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

The aminopeptidase polynucleotides can also be used to identify individuals from small biological samples. This can be done, for example, using restriction fragment-length polymorphism (RFLP) to identify an individual. Thus, the polynucleotides described herein are useful as DNA markers for RFLP (see U.S. Pat. No. 5,272,057).

Furthermore, the aminopeptidase sequence can be used to provide an alternative technique, which determines the actual DNA sequence of selected fragments in the genome of an individual. Thus, the aminopeptidase sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify DNA from an individual for subsequent sequencing.
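Deriving a primer pair from the two ends of a sequence can be sketched in Python as shown below. This is a minimal illustration under stated assumptions: the function names, the default primer length of 20 bases, and the example sequence are hypothetical, and real primer design would also weigh melting temperature, GC content, and self-complementarity, none of which is modelled here.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def end_primers(sequence, primer_length=20):
    """Return (forward, reverse) primers taken from the 5' and 3' ends.

    The forward primer copies the 5' end of the given strand; the reverse
    primer is the reverse complement of its 3' end, so that both primers
    read 5' to 3' and point toward each other.
    """
    if len(sequence) < 2 * primer_length:
        raise ValueError("sequence too short for two non-overlapping primers")
    forward = sequence[:primer_length]
    reverse = reverse_complement(sequence[-primer_length:])
    return forward, reverse


# Hypothetical 21-base sequence and 5-base primers, chosen only for readability.
forward_primer, reverse_primer = end_primers("ATGGCCAAATTTGGGCCCTGA", primer_length=5)
```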

Panels of corresponding DNA sequences from individuals prepared in this manner can provide unique individual identifications, as each individual will have a unique set of such DNA sequences. It is estimated that allelic variation in humans occurs with a frequency of about once per 500 bases. Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. The aminopeptidase sequences can be used to obtain such identification sequences from individuals and from tissue. The sequences represent unique fragments of the human genome. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes.

If a panel of reagents from the sequences is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

The aminopeptidase polynucleotides can also be used in forensic identification procedures. PCR technology can be used to amplify DNA sequences taken from very small biological samples, such as a single hair follicle or body fluids (e.g., blood, saliva, or semen). The amplified sequence can then be compared to a standard, allowing identification of the origin of the sample.

The aminopeptidase polynucleotides can thus be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As described above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to the noncoding region are particularly useful since greater polymorphism occurs in the noncoding regions, making it easier to differentiate individuals using this technique.

The aminopeptidase polynucleotides can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This is useful in cases in which a forensic pathologist is presented with a tissue of unknown origin. Panels of aminopeptidase probes can be used to identify tissue by species and/or by organ type.

In a similar fashion, these primers and probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

Alternatively, the aminopeptidase polynucleotides can be used directly to block transcription or translation of aminopeptidase gene sequences by means of antisense or ribozyme constructs. Thus, in a disorder characterized by abnormally high or undesirable aminopeptidase gene expression, nucleic acids can be directly used for treatment.

The aminopeptidase polynucleotides are thus useful as antisense constructs to control aminopeptidase gene expression in cells, tissues, and organisms. A DNA antisense polynucleotide is designed to be complementary to a region of the gene involved in transcription, preventing transcription and hence production of aminopeptidase protein. An antisense RNA or DNA polynucleotide would hybridize to the mRNA and thus block translation of mRNA into aminopeptidase protein.

Examples of antisense molecules useful to inhibit nucleic acid expression include antisense molecules complementary to a fragment of the 5′ untranslated region of SEQ ID NO 2 which also includes the start codon and antisense molecules which are complementary to a fragment of the 3′ untranslated region of SEQ ID NO 2.

Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to decrease expression of aminopeptidase nucleic acid. Accordingly, these molecules can treat a disorder characterized by abnormal or undesired aminopeptidase nucleic acid expression. This technique involves cleavage by means of ribozymes containing nucleotide sequences complementary to one or more regions in the mRNA that attenuate the ability of the mRNA to be translated. Possible regions include coding regions and particularly coding regions corresponding to the catalytic and other functional activities of the aminopeptidase protein.

The aminopeptidase polynucleotides also provide vectors for gene therapy in patients containing cells that are aberrant in aminopeptidase gene expression. Thus, recombinant cells, which include the patient's cells that have been engineered ex vivo and returned to the patient, are introduced into an individual where the cells produce the desired aminopeptidase protein to treat the individual.

The invention also encompasses kits for detecting the presence of an aminopeptidase nucleic acid in a biological sample. For example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting aminopeptidase nucleic acid in a biological sample; means for determining the amount of aminopeptidase nucleic acid in the sample; and means for comparing the amount of aminopeptidase nucleic acid in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect aminopeptidase mRNA or DNA.

Computer Readable Means

The nucleotide or amino acid sequences of the invention are also provided in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a nucleotide or amino acid sequence of the present invention. Such a manufacture provides the nucleotide or amino acid sequences, or a subset thereof (e.g., a subset of open reading frames (ORFs)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form.

In one application of this embodiment, a nucleotide or amino acid sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. The skilled artisan will readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable medium. The skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide or amino acid sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
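As one hedged illustration of recording sequence information in a database application, the following Python sketch uses SQLite through the standard library. SQLite is substituted here purely for convenience and is not one of the database products named above; the table layout, function names, record name, and the short placeholder sequence are all assumptions, and the placeholder is not a sequence of the invention.

```python
import sqlite3


def store_sequences(connection, records):
    """Record (name, kind, residues) tuples in a simple sequences table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "name TEXT PRIMARY KEY, kind TEXT NOT NULL, residues TEXT NOT NULL)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?, ?)", records
    )
    connection.commit()


def fetch_sequence(connection, name):
    """Return the stored residues for a name, or None if it is absent."""
    row = connection.execute(
        "SELECT residues FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


# In-memory database and a made-up nucleotide record for demonstration.
demo_connection = sqlite3.connect(":memory:")
store_sequences(demo_connection, [("example_fragment", "nucleotide", "ATGGCCAAATAG")])
retrieved = fetch_sequence(demo_connection, "example_fragment")
```

The same records could equally be written to a plain ASCII text file; the choice depends, as the text notes, on how the stored information is to be accessed.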

By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif.

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

For example, software which implements the BLAST (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLAZE (Brutlag et al. (1993) Comp. Chem. 17:203-207) search algorithms on a Sybase system can be used to identify open reading frames (ORFs) of the sequences of the invention which contain homology to ORFs or proteins from other libraries. Such ORFs are protein encoding fragments and are useful in producing commercially important proteins such as enzymes used in various reactions and in the production of commercially useful metabolites.
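What identifying an open reading frame involves can be sketched with a simple Python scan. This is not BLAST or BLAZE and performs no homology search; it is a minimal illustration that assumes an ORF runs from an ATG start codon to the first in-frame stop codon and examines only the three forward reading frames. The function name, the min_codons parameter, and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=1):
    """Return (start, end, orf) tuples for forward-strand ORFs.

    start is the 0-based index of the ATG and end is one past the stop codon.
    min_codons is the minimum number of codons before the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, sequence[start:i + 3]))
                start = None  # close the frame and look for the next ATG
    return sorted(orfs)


# Hypothetical 13-base sequence containing one short ORF in the third frame.
example_orfs = find_orfs("CCATGAAATAGGG")
```

A fuller tool would also scan the reverse complement strand and translate each ORF before comparing it with other libraries.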

Vectors/Host Cells

The invention also provides vectors containing the aminopeptidase polynucleotides. The term “vector” refers to a vehicle, preferably a nucleic acid molecule, that can transport the aminopeptidase polynucleotides. When the vector is a nucleic acid molecule, the aminopeptidase polynucleotides are covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, YAC, or MAC.

A vector can be maintained in the host cell as an extrachromosomal element where it replicates and produces additional copies of the aminopeptidase polynucleotides. Alternatively, the vector may integrate into the host cell genome and produce additional copies of the aminopeptidase polynucleotides when the host cell replicates.

The invention provides vectors for the maintenance (cloning vectors) or vectors for expression (expression vectors) of the aminopeptidase polynucleotides. The vectors can function in prokaryotic or eukaryotic cells or in both (shuttle vectors).

Expression vectors contain cis-acting regulatory regions that are operably linked in the vector to the aminopeptidase polynucleotides such that transcription of the polynucleotides is allowed in a host cell. The polynucleotides can be introduced into the host cell with a separate polynucleotide capable of affecting transcription. Thus, the second polynucleotide may provide a trans-acting factor interacting with the cis-regulatory control region to allow transcription of the aminopeptidase polynucleotides from the vector.

Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself.

It is understood, however, that in some embodiments, transcription and/or translation of the aminopeptidase polynucleotides can occur in a cell-free system.

The regulatory sequences to which the polynucleotides described herein can be operably linked include promoters for directing mRNA transcription. These include, but are not limited to, the left promoter from bacteriophage λ, the lac, TRP, and TAC promoters from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early and late promoters, and retrovirus long-terminal repeats.

In addition to control regions that promote transcription, expression vectors may also include regions that modulate transcription, such as repressor binding sites and enhancers. Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

In addition to containing sites for transcription initiation and control, expression vectors can also contain sequences necessary for transcription termination and, in the transcribed region, a ribosome binding site for translation. Other regulatory control elements for expression include initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in the art would be aware of the numerous regulatory sequences that are useful in expression vectors. Such regulatory sequences are described, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

A variety of expression vectors can be used to express an aminopeptidase polynucleotide. Such vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as baculoviruses, papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of these sources such as those derived from plasmid and bacteriophage genetic elements, e.g. cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The regulatory sequence may provide constitutive expression in one or more host cells (i.e. tissue specific) or may provide for inducible expression in one or more cell types such as by temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art.

The aminopeptidase polynucleotides can be inserted into the vector nucleic acid by well-known methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an expression vector by cleaving the DNA sequence and the expression vector with one or more restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme digestion and ligation are well known to those of ordinary skill in the art.
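The restriction digestion step can be modeled in silico when planning such a construct. The following minimal Python sketch locates recognition sites on the top strand of a linear DNA molecule and returns the resulting fragments. The enzyme table is limited to two well-known enzymes (EcoRI, which cuts G^AATTC, and BamHI, which cuts G^GATCC); the function name and the test sequence are assumptions made for the example, and neither enzyme is specified by this disclosure.

```python
# Recognition site and top-strand cut offset for two common enzymes.
SITES = {
    "EcoRI": ("GAATTC", 1),  # cuts between G and AATTC
    "BamHI": ("GGATCC", 1),  # cuts between G and GATCC
}

def digest(dna, enzyme):
    """Return top-strand fragments of a linear molecule cut by enzyme."""
    site, offset = SITES[enzyme]
    dna = dna.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical toy insert carrying one EcoRI and one BamHI site.
insert = "AAGAATTCTTGGATCCAA"
print(digest(insert, "EcoRI"))
print(digest(insert, "BamHI"))
```

Fragments whose ends were produced by the same enzyme carry compatible overhangs and can be joined by ligation; the sketch does not model overhangs or circular molecules.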

The vector containing the appropriate polynucleotide can be introduced into an appropriate host cell for propagation or expression using well-known techniques. Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and CHO cells, and plant cells.

As described herein, it may be desirable to express the polypeptide as a fusion protein. Accordingly, the invention provides fusion vectors that allow for the production of the aminopeptidase polypeptides. Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the recombinant protein, and aid in the purification of the protein by acting, for example, as a ligand for affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion moiety so that the desired polypeptide can ultimately be separated from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et al. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al. (1988) Gene 69:301-315) and pET 11d (Studier et al. (1990) Gene Expression Technology: Methods in Enzymology 185:60-89).

Recombinant protein expression can be maximized in a host bacterium by providing a genetic background wherein the host cell has an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S. (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Alternatively, the sequence of the polynucleotide of interest can be altered to provide preferential codon usage for a specific host cell, for example E. coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118).
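The codon-usage alteration described above can be sketched as a synonymous recoding of a coding sequence. In the minimal Python example below, each codon is replaced by a host-preferred codon for the same amino acid where one is listed, so the encoded polypeptide is unchanged. The standard genetic code table is built in the conventional TCAG order; the `preferred` mapping is an illustrative placeholder rather than measured usage data, and the function names and test sequence are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, if one is given."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)

# Placeholder preferences (assumed for illustration, not usage data).
preferred = {"R": "CGT", "L": "CTG"}
original = "ATGAGACTATAA"
optimized = recode(original, preferred)
print(optimized, translate(original) == translate(optimized))
```

A practical recoding would draw the preferred codons from a usage table for the chosen host and would also check for unwanted restriction sites or secondary structure introduced by the changes.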

The aminopeptidase polynucleotides can also be expressed by expression vectors that are operative in yeast. Examples of vectors for expression in yeast, e.g., S. cerevisiae, include pYepSec1 (Baldari et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan et al. (1982) Cell 30:933-943), pJRY88 (Schultz et al. (1987) Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, Calif.).

The aminopeptidase polynucleotides can also be expressed in insect cells using, for example, baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell Biol. 3:2156-2165) and the pVL series (Luckow et al. (1989) Virology 170:31-39).

In certain embodiments of the invention, the polynucleotides described herein are expressed in mammalian cells using mammalian expression vectors. Examples of mammalian expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195).

The expression vectors listed herein are provided by way of example only of the well-known vectors available to those of ordinary skill in the art that would be useful to express the aminopeptidase polynucleotides. The person of ordinary skill in the art would be aware of other vectors suitable for maintenance, propagation, or expression of the polynucleotides described herein. These are found, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The invention also encompasses vectors in which the nucleic acid sequences described herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or to a portion, of the polynucleotide sequences described herein, including both coding and non-coding regions. Expression of this antisense RNA is subject to each of the parameters described above in relation to expression of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific expression).
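The relationship between a sense sequence and the antisense transcript produced from a reverse-orientation insert can be illustrated with a short Python sketch. Given the sense (coding-strand) DNA, the antisense RNA corresponds to its reverse complement written with uracil in place of thymine. The function names and the six-base example are assumptions made for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return dna.upper().translate(_COMPLEMENT)[::-1]

def antisense_rna(sense_dna):
    """Return the antisense RNA (5'->3') for a sense-strand DNA sequence."""
    return reverse_complement(sense_dna).replace("T", "U")

# Hypothetical fragment: the sense mRNA would read AUGGCC.
print(antisense_rna("ATGGCC"))
```

The same functions apply whether the insert covers all or only a portion of the sequence, since the antisense transcript is defined position by position.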

The invention also relates to recombinant host cells containing the vectors described herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

The recombinant host cells are prepared by introducing the vector constructs described herein into the cells by techniques readily available to the person of ordinary skill in the art. These include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, lipofection, and other techniques such as those found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

Host cells can contain more than one vector. Thus, different nucleotide sequences can be introduced on different vectors of the same cell. Similarly, the aminopeptidase polynucleotides can be introduced either alone or with other polynucleotides that are not related to the aminopeptidase polynucleotides, such as those providing trans-acting factors for expression vectors. When more than one vector is introduced into a cell, the vectors can be introduced independently, co-introduced, or joined to the aminopeptidase polynucleotide vector.

In the case of bacteriophage and viral vectors, these can be introduced into cells as packaged or encapsulated virus by standard procedures for infection and transduction. Viral vectors can be replication-competent or replication-defective. In the case in which viral replication is defective, replication will occur in host cells providing functions that complement the defects.

Vectors generally include selectable markers that enable the selection of the subpopulation of cells that contain the recombinant vector constructs. The marker can be contained in the same vector that contains the polynucleotides described herein or may be on a separate vector. Markers include tetracycline or ampicillin-resistance genes for prokaryotic host cells and dihydrofolate reductase or neomycin resistance for eukaryotic host cells. However, any marker that provides selection for a phenotypic trait will be effective.

While the mature proteins can be produced in bacteria, yeast, mammalian cells, and other cells under the control of the appropriate regulatory sequences, cell-free transcription and translation systems can also be used to produce these proteins using RNA derived from the DNA constructs described herein.

Where secretion of the polypeptide is desired, appropriate secretion signals are incorporated into the vector. The signal sequence can be endogenous to the aminopeptidase polypeptides or heterologous to these polypeptides.

Where the polypeptide is not secreted into the medium, the protein can be isolated from the host cell by standard disruption procedures, including freeze thaw, sonication, mechanical disruption, use of lysing agents and the like. The polypeptide can then be recovered and purified by well-known purification methods including ammonium sulfate precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic-interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, or high performance liquid chromatography.

It is also understood that depending upon the host cell in recombinant production of the polypeptides described herein, the polypeptides can have various glycosylation patterns, depending upon the cell, or may be non-glycosylated as when produced in bacteria. In addition, the polypeptides may include an initial modified methionine in some cases as a result of a host-mediated process.

Uses of Vectors and Host Cells

It is understood that “host cells” and “recombinant host cells” refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

The host cells expressing the polypeptides described herein, and particularly recombinant host cells, have a variety of uses. First, the cells are useful for producing aminopeptidase proteins or polypeptides that can be further purified to produce desired amounts of aminopeptidase protein or fragments. Thus, host cells containing expression vectors are useful for polypeptide production.

Host cells are also useful for conducting cell-based assays involving the aminopeptidase or aminopeptidase fragments. Thus, a recombinant host cell expressing a native aminopeptidase is useful to assay for compounds that stimulate or inhibit aminopeptidase function. This includes zinc or peptide binding, gene expression at the level of transcription or translation, and interaction with other cellular components.

Host cells are also useful for identifying aminopeptidase mutants in which these functions are affected. If the mutants naturally occur and give rise to a pathology, host cells containing the mutations are useful to assay compounds that have a desired effect on the mutant aminopeptidase (for example, stimulating or inhibiting function) which may not be indicated by their effect on the native aminopeptidase.

Recombinant host cells are also useful for expressing the chimeric polypeptides described herein to assess compounds that activate or suppress activation by means of a heterologous domain, segment, site, and the like, as disclosed herein.

Further, mutant aminopeptidases can be designed in which one or more of the various functions is engineered to be increased or decreased and used to augment or replace aminopeptidase proteins in an individual. Thus, host cells can provide a therapeutic benefit by replacing an aberrant aminopeptidase or providing an aberrant aminopeptidase that provides a therapeutic result. In one embodiment, the cells provide aminopeptidases that are abnormally active.

In another embodiment, the cells provide aminopeptidases that are abnormally inactive. These aminopeptidases can compete with endogenous aminopeptidases in the individual.

In another embodiment, cells expressing aminopeptidases that cannot be activated are introduced into an individual in order to compete with endogenous aminopeptidases for zinc or peptide. For example, in the case in which excessive zinc is part of a treatment modality, it may be necessary to effectively inactivate zinc at a specific point in treatment. Providing cells that compete for the molecule, but which cannot be affected by aminopeptidase activation, would be beneficial.

Homologously recombinant host cells can also be produced that allow the in situ alteration of endogenous aminopeptidase polynucleotide sequences in a host cell genome. This technology is more fully described in WO 93/09222, WO 91/12650 and U.S. Pat. No. 5,641,670. Briefly, specific polynucleotide sequences corresponding to the aminopeptidase polynucleotides or sequences proximal or distal to an aminopeptidase gene are allowed to integrate into a host cell genome by homologous recombination where expression of the gene can be affected. In one embodiment, regulatory sequences are introduced that either increase or decrease expression of an endogenous sequence. Accordingly, an aminopeptidase protein can be produced in a cell not normally producing it, or increased expression of aminopeptidase protein can result in a cell normally producing the protein at a specific level. Alternatively, the entire gene can be deleted. Still further, specific mutations can be introduced into any desired region of the gene to produce mutant aminopeptidase proteins. Such mutations could be introduced, for example, into the specific regions disclosed herein.

In one embodiment, the host cell can be a fertilized oocyte or embryonic stem cell that can be used to produce a transgenic animal containing the altered aminopeptidase gene. Alternatively, the host cell can be a stem cell or other early tissue precursor that gives rise to a specific subset of cells and can be used to produce transgenic tissues in an animal. See also Thomas et al., Cell 51:503 (1987) for a description of homologous recombination vectors. The vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced gene has homologously recombined with the endogenous aminopeptidase gene are selected (see e.g., Li, E. et al. (1992) Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see e.g., Bradley, A. in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, E. J. Robertson, ed. (IRL, Oxford, 1987) pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley, A. (1991) Current Opinion in Biotechnology 2:823-829 and in PCT International Publication Nos. WO 90/11354; WO 91/01140; and WO 93/04169.

The genetically engineered host cells can be used to produce non-human transgenic animals. A transgenic animal is preferably a mammal, for example a rodent, such as a rat or mouse, in which one or more of the cells of the animal include a transgene. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal in one or more cell types or tissues of the transgenic animal. These animals are useful for studying the function of an aminopeptidase protein and identifying and evaluating modulators of aminopeptidase protein activity.

Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, and amphibians.

In one embodiment, a host cell is a fertilized oocyte or an embryonic stem cell into which aminopeptidase polynucleotide sequences have been introduced.

A transgenic animal can be produced by introducing nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. Any of the aminopeptidase nucleotide sequences can be introduced as a transgene into the genome of a non-human animal, such as a mouse.

Any of the regulatory or other sequences useful in expression vectors can form part of the transgenic sequence. This includes intronic sequences and polyadenylation signals, if not already included. A tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct expression of the aminopeptidase protein to particular cells.

Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of transgenic mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene can further be bred to other transgenic animals carrying other transgenes. A transgenic animal also includes animals in which the entire animal or tissues in the animal have been produced using the homologously recombinant host cells described herein.

In another embodiment, transgenic non-human animals can be produced which contain selected systems that allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) PNAS 89:6232-6236. Another example of a recombinase system is the FLP recombinase system of S. cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut et al. (1997) Nature 385:810-813 and PCT International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a somatic cell, from the transgenic animal can be isolated and induced to exit the growth cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the use of electrical pulses, to an enucleated oocyte from an animal of the same species from which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to morula or blastocyst and then transferred to a pseudopregnant female foster animal. The offspring born of this female foster animal will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

Transgenic animals containing recombinant cells that express the polypeptides described herein are useful to conduct the assays described herein in an in vivo context. The various physiological factors that are present in vivo and that could affect binding or activation may not be evident from in vitro cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals to assay in vivo aminopeptidase function, including peptide interaction, the effect of specific mutant aminopeptidases on aminopeptidase function and peptide interaction, and the effect of chimeric aminopeptidases. It is also possible to assess the effect of null mutations, that is, mutations that substantially or completely eliminate one or more aminopeptidase functions.

Pharmaceutical Compositions

The aminopeptidase nucleic acid molecules, protein, modulators of the protein, and antibodies (also referred to herein as “active compounds”) can be incorporated into pharmaceutical compositions suitable for administration to a subject, e.g., a human. Such compositions typically comprise the nucleic acid molecule, protein, modulator, or antibody and a pharmaceutically acceptable carrier.

The term “administer” is used in its broadest sense and includes any method of introducing the compositions of the present invention into a subject. This includes producing polypeptides or polynucleotides in vivo by in vivo transcription or translation of polynucleotides that have been exogenously introduced into a subject. Thus, polypeptides or nucleic acids produced in the subject from the exogenous compositions are encompassed in the term “administer.”

As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, such media can be used in the compositions of the invention. Supplementary active compounds can also be incorporated into the compositions. A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampules, disposable syringes or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents in the composition, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., an aminopeptidase protein or anti-aminopeptidase antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For oral administration, the agent can be contained in enteric forms to survive the stomach or further coated or mixed to be released in a particular region of the GI tract by known methods. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser, which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. “Dosage unit form” as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated, each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specifications for the dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (U.S. Pat. No. 5,328,470) or by stereotactic injection (see, e.g., Chen et al. (1994) PNAS 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

This invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will fully convey the invention to those skilled in the art. Many modifications and other embodiments of the invention will come to mind to one skilled in the art to which this invention pertains having the benefit of the teachings presented in the foregoing description. Although specific terms are employed, they are used as in the art unless otherwise indicated.

1. An isolated nucleic acid molecule that encodes a polypeptide having an aminopeptidase protein activity, selected from the group consisting of: a) a nucleic acid molecule comprising a nucleotide sequence which is at least 70% identical to the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number PTA-1642; b) a nucleic acid molecule comprising a nucleotide sequence which is at least 80% identical to the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number PTA-1642; c) a nucleic acid molecule comprising a nucleotide sequence which is at least 90% identical to the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number PTA-1642; and d) a nucleic acid molecule that hybridizes to a nucleic acid molecule comprising SEQ ID NO:2, SEQ ID NO:3, or a complement thereof, under stringent conditions, said stringent conditions comprising hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 50-65° C.
2. The nucleic acid molecule of claim 1 further comprising vector nucleic acid sequences.
3. The nucleic acid molecule of claim 1 further comprising nucleic acid sequences encoding a heterologous polypeptide.
4. A host cell which contains the nucleic acid molecule of claim 2.
5. The host cell of claim 4 which is a mammalian host cell.
6. A non-human mammalian host cell containing the nucleic acid molecule of claim 1.
7. An isolated polypeptide having an aminopeptidase protein activity, selected from the group consisting of: a) a polypeptide which is encoded by a nucleic acid molecule comprising a nucleotide sequence which is at least 70% identical to a nucleic acid comprising the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number PTA-1642; b) a polypeptide which is encoded by a nucleic acid molecule comprising a nucleotide sequence which is at least 80% identical to the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number PTA-1642; c) a polypeptide which is encoded by a nucleic acid molecule comprising a nucleotide sequence which is at least 90% identical to the nucleotide sequence of SEQ ID NO:2, SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number PTA-1642; d) a polypeptide encoded by a nucleic acid molecule which hybridizes to a nucleic acid molecule comprising SEQ ID NO:2, SEQ ID NO:3, or a complement thereof under stringent conditions, said stringent conditions comprising hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 50-65° C.; and e) a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO:1, or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with the ATCC as Accession Number PTA-1642, wherein the fragment comprises at least 50 contiguous amino acids of SEQ ID NO:2.
8. The isolated polypeptide of claim 7 comprising the amino acid sequence of SEQ ID NO:1.
9. The polypeptide of claim 7 further comprising heterologous amino acid sequences.
10. An antibody which selectively binds to a polypeptide of claim 7.
11. A method for producing a polypeptide comprising culturing the host cell of claim 4 under conditions in which the nucleic acid molecule is expressed.
12. A method for detecting the presence of a polypeptide of claim 7 in a sample, comprising: a) contacting the sample with a compound which selectively binds to a polypeptide of claim 7; and b) determining whether the compound binds to the polypeptide in the sample.
13. A method for detecting the presence of a polypeptide of claim 8 in a sample, comprising: a) contacting the sample with a compound which selectively binds to a polypeptide of claim 8; and b) determining whether the compound binds to the polypeptide in the sample.
14. The method of claim 13, wherein the compound which binds to the polypeptide is an antibody.
15. A kit comprising a compound which selectively binds to a polypeptide of claim 8 and instructions for use.
16. A method for detecting the presence of a nucleic acid molecule of claim 1 in a sample, comprising the steps of: a) contacting the sample with a nucleic acid probe or primer which selectively hybridizes to the nucleic acid molecule; and b) determining whether the nucleic acid probe or primer binds to a nucleic acid molecule in the sample.
17. The method of claim 16, wherein the sample comprises mRNA molecules and is contacted with a nucleic acid probe.
18. A kit comprising a compound which selectively hybridizes to a nucleic acid molecule of claim 1 and instructions for use.
19. A method for identifying a compound which binds to a polypeptide of claim 8 comprising the steps of: a) contacting a polypeptide, or a cell expressing a polypeptide of claim 8 with a test compound; and b) determining whether the polypeptide binds to the test compound.
20. The method of claim 19, wherein the binding of the test compound to the polypeptide is detected by a method selected from the group consisting of: a) detection of binding by direct detecting of test compound/polypeptide binding; b) detection of binding using a competition binding assay; and c) detection of binding using an assay for aminopeptidase protein activity.
21. A method for modulating the activity of a polypeptide of claim 8 comprising contacting a polypeptide or a cell expressing a polypeptide of claim 8 with a compound which binds to the polypeptide in a sufficient concentration to modulate the activity of the polypeptide.
22. A method for identifying a compound which modulates the activity of a polypeptide of claim 8, comprising: a) contacting a polypeptide of claim 8 with a test compound; and b) determining the effect of the test compound on the activity of the polypeptide to thereby identify a compound which modulates the activity of the polypeptide.
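
Claims 1 and 7 recite nucleotide sequences that are at least 70%, 80%, or 90% identical to a reference sequence. As a rough illustration only, the following Python sketch computes percent identity over two sequences that are assumed to be already aligned and of equal length, and reports which of the recited thresholds are met. The function names, the gap symbol, and the toy sequences are hypothetical and are not taken from SEQ ID NO:2 or SEQ ID NO:3; the choice of alignment method and gap treatment is an assumption of this sketch, and any definition of percent identity given in the specification governs.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption: the inputs are already aligned and of equal length.
    A column containing a gap in either sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float, thresholds=(70.0, 80.0, 90.0)) -> list:
    """Return the identity thresholds (in percent) that `identity` satisfies."""
    return [t for t in thresholds if identity >= t]


# Toy 20-nucleotide sequences (hypothetical); they differ at 3 positions.
reference = "ATGGCCATTGTAATGGGCCG"
candidate = "ATCGCCATTGTTATGGGCCA"

identity = percent_identity(reference, candidate)
print(identity, thresholds_met(identity))
```

In practice the two sequences would first be aligned with an established alignment tool, and the identity value depends on the alignment parameters chosen.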